I. Thermodynamics 


I.A Fundamental definitions 


• Thermodynamics is a phenomenological description of equilibrium properties of macroscopic
systems.

⋆ As a phenomenological description, it is based on a number of empirical observations
which are summarized by the laws of thermodynamics. A coherent logical and mathe- 
matical structure is then constructed on the basis of these observations, which leads to 
a variety of useful concepts, and to testable relationships among various quantities. The 
laws of thermodynamics can only be justified by a more fundamental (microscopic) theory 
of nature. For example, statistical mechanics attempts to obtain these laws starting from 
classical or quantum mechanical equations for the evolution of collections of particles. 

⋆ A system under study is said to be in equilibrium when its properties do not change
appreciably with time over the intervals of interest (observation times). The dependence 
on the observation time makes the concept of equilibrium subjective. For example, window 
glass is in equilibrium as a solid over many decades, but flows like a fluid over time scales 
of millennia. At the other extreme, it is perfectly legitimate to consider the equilibrium 
between matter and radiation in the early universe during the first minutes of the big bang. 

⋆ The macroscopic system in equilibrium is characterized by a number of thermodynamic
coordinates or state functions. Some common examples of such coordinates are pressure
and volume (for a fluid), surface tension and area (for a film), tension and length (for
a wire), electric field and polarization (for a dielectric), etc. A closed system is an
idealization similar to a point particle in mechanics in that it is assumed to be completely
isolated by adiabatic walls that don't allow any exchange of heat with the surroundings.
By contrast, diathermic walls allow heat exchange for an open system. In addition to the
above mechanical coordinates, the laws of thermodynamics imply the existence of other
equilibrium state functions as described in the following sections.




I.B The zeroth law 


The zeroth law of thermodynamics describes the transitive nature of thermal equilib- 
rium. It states: 

• If two systems, A and B, are separately in equilibrium with a third system C, then they
are also in equilibrium with one another. 

Despite its apparent simplicity, the zeroth law implies the existence of an important
state function, the empirical temperature $\Theta$, such that systems in equilibrium are
at the same temperature.

Proof: Let the equilibrium state of systems A, B, and C be described by the coordinates
$\{A_1, A_2, \cdots\}$, $\{B_1, B_2, \cdots\}$, and $\{C_1, C_2, \cdots\}$ respectively. The
assumption that A and C are in equilibrium implies a constraint between the coordinates of
A and C, i.e. a change in $A_1$ must be accompanied by some changes in
$\{A_2, \cdots; C_1, C_2, \cdots\}$ to maintain equilibrium of A and C. Denote this constraint by

$$f_{AC}(A_1, A_2, \cdots; C_1, C_2, \cdots) = 0. \tag{I.1}$$

The equilibrium of B and C implies a similar constraint

$$f_{BC}(B_1, B_2, \cdots; C_1, C_2, \cdots) = 0. \tag{I.2}$$

Each of the above equations can be solved for $C_1$ to yield

$$\begin{aligned}
C_1 &= F_{AC}(A_1, A_2, \cdots; C_2, \cdots), \\
C_1 &= F_{BC}(B_1, B_2, \cdots; C_2, \cdots).
\end{aligned} \tag{I.3}$$

Thus if C is separately in equilibrium with A and B we must have

$$F_{AC}(A_1, A_2, \cdots; C_2, \cdots) = F_{BC}(B_1, B_2, \cdots; C_2, \cdots). \tag{I.4}$$

However, according to the zeroth law there is also equilibrium between A and B, implying
the constraint

$$f_{AB}(A_1, A_2, \cdots; B_1, B_2, \cdots) = 0. \tag{I.5}$$

Therefore it must be possible to simplify eq.(I.4) by cancelling the coordinates of C. Hence,
the condition (I.5) for equilibrium of A and B must be expressible as

$$\Theta_A(A_1, A_2, \cdots) = \Theta_B(B_1, B_2, \cdots), \tag{I.6}$$

i.e., equilibrium is characterized by a function $\Theta$ of thermodynamic coordinates. This
function specifies the equation of state, and isotherms of A are described by the condition
$\Theta_A(A_1, A_2, \cdots) = \Theta$.

Example: Consider three systems: (A) a wire of length $L$ with tension $F$, (B) a paramagnet
of magnetization $M$ in a magnetic field $B$, and (C) a gas of volume $V$ at pressure $P$.
Observations indicate that when these systems are in equilibrium, the following constraints
are satisfied between their coordinates:

$$\begin{aligned}
\left(P + \frac{a}{V^2}\right)(V - b)(L - L_0) - c\left[F - K(L - L_0)\right] &= 0, \\
\left(P + \frac{a}{V^2}\right)(V - b)\,M - d\,B &= 0.
\end{aligned} \tag{I.7}$$

Clearly these constraints can be organized into three empirical temperature functions as

$$\Theta \propto \left(P + \frac{a}{V^2}\right)(V - b) = c\left(\frac{F}{L - L_0} - K\right) = d\,\frac{B}{M}. \tag{I.8}$$

These are the well known equations of state describing:

$$\begin{aligned}
(P + a/V^2)(V - b) &= N k_B T && \text{(van der Waals gas)} \\
M &= (N \mu_B^2 B)/(3 k_B T) && \text{(Curie paramagnet)} \\
F &= (K + DT)(L - L_0) && \text{(Hooke's law for rubber)}
\end{aligned} \tag{I.9}$$


The ideal gas temperature scale: As the above example indicates, the zeroth law 
merely states the presence of isotherms. In order to set up a practical temperature scale 
at this stage, a reference system is necessary. The ideal gas occupies an important place 
in thermodynamics and provides the necessary reference. Empirical observations indicate 
that the product of pressure and volume is constant along the isotherms of any gas that is 
sufficiently dilute. The ideal gas refers to this dilute limit of real gases, and the ideal gas 
temperature is proportional to the product. The constant of proportionality is determined 
by reference to the temperature of the triple point of the ice-water-gas system, which
was set to 273.16 degrees Kelvin (°K) by the 10th General Conference on Weights and
Measures in 1954. Using a dilute gas (i.e. as $P \to 0$) as thermometer, the temperature of
a system can be obtained from

$$T(^\circ\mathrm{K}) \equiv 273.16 \times \left( \lim_{P \to 0} (PV)_{\text{system}} \Big/ \lim_{P \to 0} (PV)_{\text{ice-water-gas}} \right). \tag{I.10}$$
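
A minimal numerical sketch of eq.(I.10), under stated assumptions: the helper names and all data values below are hypothetical, and the $P \to 0$ limit is approximated by a straight-line fit of $PV$ against $P$, which is only one possible way to extrapolate.

```python
# Sketch of a dilute-gas thermometer, eq.(I.10). All data below are made up.

def extrapolate_to_zero_pressure(pressures, pv_values):
    """Least-squares straight-line fit of PV versus P; returns the intercept at P = 0."""
    n = len(pressures)
    mean_p = sum(pressures) / n
    mean_pv = sum(pv_values) / n
    slope_num = sum((p - mean_p) * (pv - mean_pv)
                    for p, pv in zip(pressures, pv_values))
    slope_den = sum((p - mean_p) ** 2 for p in pressures)
    slope = slope_num / slope_den
    return mean_pv - slope * mean_p


def gas_thermometer_temperature(p_sys, pv_sys, p_ref, pv_ref):
    """Temperature from the ratio of the two zero-pressure limits, scaled by 273.16."""
    limit_system = extrapolate_to_zero_pressure(p_sys, pv_sys)
    limit_reference = extrapolate_to_zero_pressure(p_ref, pv_ref)
    return 273.16 * limit_system / limit_reference


# Hypothetical measurements: PV drifts linearly with P away from the dilute limit.
pressures = [0.2, 0.4, 0.6, 0.8]
pv_reference = [2.0 * (1 + 0.01 * p) for p in pressures]  # gas at the triple point
pv_system = [3.0 * (1 - 0.02 * p) for p in pressures]     # same gas in contact with the system

temperature = gas_thermometer_temperature(pressures, pv_system, pressures, pv_reference)
print(temperature)
```

With these made-up numbers the two intercepts are in the ratio 3:2, so the sketch assigns the system a temperature 1.5 times that of the triple point.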


I.C The First law 


We now consider transformations between different equilibrium states. Such transfor- 
mations can be achieved by applying work or heat to the system. The first law states that 
both work and heat are forms of energy, and that the total energy is conserved. We shall 
use the following formulation: 

• The amount of work required to change the state of an otherwise adiabatically isolated
system depends only on the initial and final states, and not on the means by which the 
work is performed, or on the intermediate stages through which the system passes. 

As a consequence, we conclude the existence of another state function, the internal
energy, $E(\mathbf{X})$. Up to a constant, $E(\mathbf{X})$ can be obtained from the amount of
work $\Delta W$ needed for an adiabatic transformation from an initial state $\mathbf{X}_i$ to
a final state $\mathbf{X}_f$, using

$$\Delta W = E(\mathbf{X}_f) - E(\mathbf{X}_i). \tag{I.11}$$


In a generic (non-adiabatic) transformation, the amount of work does not equal the
change in the internal energy. The difference $\Delta Q = \Delta E - \Delta W$ is defined as the
heat intake of the system from its surroundings. Clearly in such transformations, $\Delta Q$ and
$\Delta W$ are not separately functions of state, in that they depend on external factors such
as the means of applying work, and not only on the final states. To emphasize this, for a
differential transformation we write (with đ denoting an inexact differential)

$$đQ = dE - đW, \tag{I.12}$$

where $dE = \sum_i \partial_i E \, dX_i$ can be obtained by differentiation, while $đQ$ and $đW$
generally cannot. Also note the convention that the signs of work and heat are chosen to
indicate the energy added to the system, and not vice versa.

A quasi-static transformation is one that is performed sufficiently slowly so that the 
system is always in equilibrium. Thus at any stage of the process, the thermodynamic 
coordinates of the system exist and can in principle be computed. For such transformations, 
the work done on the system (equal in magnitude but opposite in sign to the work done 
by the system) can be related to changes in these coordinates. Typically one can divide 
the state functions $\{\mathbf{X}\}$ into a set of generalized displacements $\{\mathbf{x}\}$, and
their conjugate generalized forces $\{\mathbf{J}\}$, such that for an infinitesimal quasi-static
transformation

$$đW = \sum_i J_i \, dx_i. \tag{I.13}$$



Table 1 provides some common examples of such coordinates. Note that the displacement
is usually an extensive quantity, i.e. proportional to system size, while the forces are
intensive and independent of size. Also note that pressure is by convention calculated
from the force exerted by the system on the walls, as opposed to the force on a spring
which is exerted in the opposite direction. This is the origin of the negative sign that
usually accompanies hydrostatic work.


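System              Force                    Displacement
Wire                Tension F                Length L
Film                Surface tension          Area
Fluid               Pressure -P              Volume V
Magnet              Magnetic field B         Magnetization M
Dielectric          Electric field           Polarization
Chemical Reaction   Chemical potential μ     Particle Number N


Table 1: Generalized Forces and Displacements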


Joule's Free Expansion Experiment: Another important property of the ideal gas
is the behavior of its internal energy. Observations indicate that if such a gas expands
adiabatically (but not necessarily quasi-statically), from a volume $V_i$ to $V_f$, the initial
and final temperatures are the same. Since the transformation is adiabatic ($\Delta Q = 0$) and
there is no external work done on the system ($\Delta W = 0$), the internal energy of the gas
is unchanged. Since the pressure and volume of the gas change in the process, but its
temperature does not, we conclude that the internal energy depends only on temperature,
i.e. $E(V,T) = E(T)$. This property of the ideal gas is in fact a consequence of the form of
its equation of state as will be proved in test 1 review problems.

Response functions are the usual method for characterizing the macroscopic behavior
of a system. They are experimentally measured from the changes of thermodynamic
coordinates with external probes. Some common response functions are:

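Heat Capacities are obtained from the change in temperature upon addition of heat to the
system. Since heat is not a function of state, the path by which it is supplied must also be
specified. For example, for a gas we can calculate the heat capacities at constant volume
or pressure, denoted by $C_V = đQ/dT|_V$ and $C_P = đQ/dT|_P$ respectively. The latter is
larger since some of the heat is used up in the work done in changes of volume:

$$\begin{aligned}
C_V &= \left.\frac{đQ}{dT}\right|_V = \left.\frac{dE - đW}{dT}\right|_V = \left.\frac{dE + P\,dV}{dT}\right|_V = \left.\frac{\partial E}{\partial T}\right|_V, \\
C_P &= \left.\frac{đQ}{dT}\right|_P = \left.\frac{dE - đW}{dT}\right|_P = \left.\frac{dE + P\,dV}{dT}\right|_P = \left.\frac{\partial E}{\partial T}\right|_P + P \left.\frac{\partial V}{\partial T}\right|_P.
\end{aligned} \tag{I.14}$$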


Force Constants measure the (infinitesimal) ratio of displacement to force and are
generalizations of the spring constant. Examples include the isothermal compressibility of a
gas $\kappa_T = -\left.\partial V/\partial P\right|_T / V$, and the susceptibility of a magnet
$\chi_T = \left.\partial M/\partial B\right|_T / V$. From the equation of state of an ideal gas
$PV \propto T$, we obtain $\kappa_T = 1/P$.
Thermal Responses probe the change in the thermodynamic coordinates with temperature.
For example, the expansivity of a gas is given by $\alpha_P = \left.\partial V/\partial T\right|_P / V$,
which equals $1/T$ for the ideal gas.

Since the internal energy of an ideal gas depends only on its temperature,
$\left.\partial E/\partial T\right|_V = \left.\partial E/\partial T\right|_P = dE/dT$, and eq.(I.14)
simplifies to

$$C_P - C_V = P \left.\frac{\partial V}{\partial T}\right|_P = P V \alpha_P = \frac{PV}{T} \equiv N k_B. \tag{I.15}$$

The last equality follows from extensivity: for a given amount of ideal gas, the constant
$PV/T$ is proportional to $N$, the number of particles in the gas; the ratio is Boltzmann's
constant with a value of $k_B \approx 1.4 \times 10^{-23}\ \mathrm{J}\,{}^\circ\mathrm{K}^{-1}$.
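
The response functions above can be checked numerically for the ideal gas. The following is a small sketch, not a measurement procedure: the function names, the state point, and the finite-difference step are assumptions chosen for illustration, and the equation of state $PV = N k_B T$ is taken as given.

```python
# Finite-difference estimates of ideal-gas response functions (illustrative sketch).

K_B = 1.380649e-23  # Boltzmann constant in J/K


def volume(pressure, temperature, n_particles):
    """Ideal gas equation of state solved for V."""
    return n_particles * K_B * temperature / pressure


def central_difference(func, x, step):
    """Symmetric two-point estimate of d(func)/dx."""
    return (func(x + step) - func(x - step)) / (2.0 * step)


def compressibility(pressure, temperature, n_particles):
    """kappa_T = -(1/V) dV/dP at constant T."""
    v = volume(pressure, temperature, n_particles)
    dv_dp = central_difference(lambda p: volume(p, temperature, n_particles),
                               pressure, 1e-4 * pressure)
    return -dv_dp / v


def expansivity(pressure, temperature, n_particles):
    """alpha_P = (1/V) dV/dT at constant P."""
    v = volume(pressure, temperature, n_particles)
    dv_dt = central_difference(lambda t: volume(pressure, t, n_particles),
                               temperature, 1e-4 * temperature)
    return dv_dt / v


def heat_capacity_difference(pressure, temperature, n_particles):
    """C_P - C_V = P V alpha_P, as in eq.(I.15)."""
    v = volume(pressure, temperature, n_particles)
    return pressure * v * expansivity(pressure, temperature, n_particles)


# Hypothetical state point: 1e22 particles at 300 K and 1e5 Pa.
P, T, N = 1.0e5, 300.0, 1.0e22
kappa_t = compressibility(P, T, N)
alpha_p = expansivity(P, T, N)
cp_minus_cv = heat_capacity_difference(P, T, N)
print(kappa_t * P, alpha_p * T, cp_minus_cv / (N * K_B))
```

Each printed ratio should be close to one, reflecting $\kappa_T = 1/P$, $\alpha_P = 1/T$, and $C_P - C_V = N k_B$.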


I.D The Second Law 


The historical development of thermodynamics follows the industrial revolution in the
19th century, and the advent of heat engines. It is interesting to see how such practical
considerations as the efficiency of engines can lead to abstract ideas like entropy.

An idealized heat engine works by taking in a certain amount of heat $Q_H$, from a
heat source (for example a coal fire), converting a portion of it to work $W$, and dumping
the remaining heat $Q_C$ into a heat sink (e.g. atmosphere). The efficiency of the engine is
calculated from

$$\eta = \frac{W}{Q_H} = \frac{Q_H - Q_C}{Q_H} \leq 1. \tag{I.16}$$

An idealized refrigerator is like an engine running backward, i.e. using work $W$ to
extract heat $Q_C$ from a cold system, and dumping heat $Q_H$ at a higher temperature. We
can similarly define a figure of merit for the performance of a refrigerator as

$$\omega = \frac{Q_C}{W} = \frac{Q_C}{Q_H - Q_C}. \tag{I.17}$$

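
The two figures of merit can be evaluated side by side. This is a minimal sketch with made-up heat values; the function names are illustrative, and the relation $\omega = 1/\eta - 1$ noted in the comment follows from the two definitions when the same $Q_H$ and $Q_C$ are used for both.

```python
# Engine efficiency, eq.(I.16), and refrigerator figure of merit, eq.(I.17).

def engine_efficiency(q_hot, q_cold):
    """eta = W / Q_H with W = Q_H - Q_C from the first law."""
    work = q_hot - q_cold
    return work / q_hot


def refrigerator_performance(q_hot, q_cold):
    """omega = Q_C / W with W = Q_H - Q_C."""
    work = q_hot - q_cold
    return q_cold / work


# Hypothetical heats in arbitrary energy units.
eta = engine_efficiency(100.0, 60.0)
omega = refrigerator_performance(100.0, 60.0)

# For the same pair of heats the two quantities are tied by omega = 1/eta - 1.
print(eta, omega, 1.0 / eta - 1.0)
```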

The first law rules out so called ‘perpetual motion machines of the first kind’, i.e. 
engines that produce work without consuming any energy. However, the conservation of 
energy is not violated by an engine that produces work by converting water to ice. Such 
a ‘perpetual motion machine of the second kind’ would certainly solve the world’s energy 
problems, but is ruled out by the second law of thermodynamics. The observation that 
the natural direction for the flow of heat is from hotter to colder bodies is the content of 
the second law of thermodynamics. There are a number of different formulations of the 
second law, such as the following two statements. 
• Kelvin's Statement: No process is possible whose sole result is the complete conversion
of heat into work.
• Clausius's Statement: No process is possible whose sole result is the transfer of heat
from a colder to a hotter body.

A perfect engine is ruled out by the first statement, a perfect refrigerator by the 
second. In fact the two statements are equivalent as shown below. 
Proof of the equivalence of the Kelvin and Clausius statements proceeds by showing that 
if one is violated so is the other. 

(a) Let us assume that there is a machine that violates Clausius's statement by taking
heat $Q$ from a cooler temperature $T_C$ to a higher temperature $T_H$. Now consider
an engine operating between these two temperatures, taking heat $Q_H$ from $T_H$ and
dumping $Q_C$ at $T_C$. The combined system takes $Q_H - Q$ from $T_H$, produces work
equal to $Q_H - Q_C$ and dumps $Q_C - Q$ at $T_C$. If we adjust the engine output such that
$Q_C = Q$, the net result is a 100% efficient engine, in violation of Kelvin's statement.
(b) Alternatively, assume a machine that violates Kelvin's statement by taking heat $Q$ and
converting it completely to work. The work output of this machine can be used to
run a refrigerator, with the net outcome of transferring heat from a colder to a hotter
body, in violation of Clausius's statement.
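
The bookkeeping in part (a) can be spelled out with a few lines of arithmetic. This is only an accounting sketch with hypothetical heat values; the function and key names are invented for illustration.

```python
# Energy accounting for an engine combined with a (hypothetical) Clausius-violating machine.

def combine_with_clausius_violator(q_engine_hot, q_engine_cold, q_pumped):
    """Net exchanges when a machine moves q_pumped from cold to hot with no work input,
    while an ordinary engine takes q_engine_hot and dumps q_engine_cold."""
    return {
        "heat_from_hot": q_engine_hot - q_pumped,   # net heat drawn from the hot reservoir
        "heat_to_cold": q_engine_cold - q_pumped,   # net heat dumped into the cold reservoir
        "work_out": q_engine_hot - q_engine_cold,   # engine work, fixed by the first law
    }


# Tune the pumped heat to match what the engine dumps (Q = Q_C).
net = combine_with_clausius_violator(q_engine_hot=100.0, q_engine_cold=60.0, q_pumped=60.0)
print(net)
```

With $Q = Q_C$ nothing reaches the cold reservoir, and all of the net heat taken from the hot reservoir appears as work, which is the situation forbidden by Kelvin's statement.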

Although these statements may appear as rather trivial and qualitative descriptions,
they have important quantitative implications as demonstrated in the next sections.


I.E Carnot Engines & Thermodynamic Temperature 


• A Carnot Engine is any engine that is reversible, runs in a cycle, with all of its heat
exchanges taking place at a source temperature $T_H$, and a sink temperature $T_C$.

A reversible process is one that can be run backward in time by simply reversing its 
inputs and outputs. It is the thermodynamic equivalent of frictionless motion in mechanics. 
Since time reversibility implies equilibrium, a reversible transformation must be quasi- 
static, but the reverse is not necessarily true (e.g., if there is energy dissipation due to 
friction). An engine that runs in a cycle returns to its original internal state at the end of 
the process. The distinguishing characteristic of the Carnot engine is that heat exchanges 
with the surroundings are carried out only at two temperatures. The zeroth law allows us
to select two isotherms at temperatures $T_H$ and $T_C$ for these heat exchanges. To complete
the Carnot cycle we have to connect these isotherms by reversible adiabatic paths in the
coordinate space. Since heat is not a function of state, we don't know how to construct
such paths in general. Fortunately, we have sufficient information at this point to construct
a Carnot engine using an ideal gas as its internal working substance. For the purpose of
demonstration, let us compute the adiabatic curves for a monatomic ideal gas with an
internal energy

$$E = \frac{3}{2} N k_B T = \frac{3}{2} PV.$$

Along a quasi-static path

$$đQ = dE - đW = d\left(\frac{3}{2} PV\right) + P\,dV = \frac{5}{2} P\,dV + \frac{3}{2} V\,dP. \tag{I.18}$$

The adiabatic condition $đQ = 0$, then implies a path

$$\frac{dP}{P} + \frac{5}{3}\frac{dV}{V} = 0, \quad \Longrightarrow \quad P V^{\gamma} = \text{constant}, \tag{I.19}$$

with $\gamma = 5/3$. The adiabatic curves are clearly distinct from the isotherms, and we can
select two such curves to intersect our isotherms, thereby completing a Carnot cycle. The
assumption of $E \propto T$ is not necessary, and in the test 1 review problems you will construct
adiabatics for any $E(T)$. In fact, a similar construction is possible for any two parameter
system with $E(J, x)$.
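
As a numerical cross-check of eq.(I.19), the adiabatic condition can be integrated step by step and compared with $PV^\gamma$. The sketch below assumes dimensionless units and a fourth-order Runge-Kutta integrator; both choices, and the function names, are illustrative.

```python
# Integrate dP/dV = -gamma P / V (the adiabatic condition) and monitor P V^gamma.

GAMMA = 5.0 / 3.0  # monatomic ideal gas


def adiabat_slope(volume, pressure):
    """dP/dV along an adiabat, from (5/2) P dV + (3/2) V dP = 0."""
    return -GAMMA * pressure / volume


def integrate_adiabat(p_start, v_start, v_end, n_steps=1000):
    """Fourth-order Runge-Kutta integration of the adiabat from v_start to v_end."""
    h = (v_end - v_start) / n_steps
    v, p = v_start, p_start
    for _ in range(n_steps):
        k1 = adiabat_slope(v, p)
        k2 = adiabat_slope(v + h / 2, p + h * k1 / 2)
        k3 = adiabat_slope(v + h / 2, p + h * k2 / 2)
        k4 = adiabat_slope(v + h, p + h * k3)
        p += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6
        v += h
    return p


# Start at P = 1, V = 1 (arbitrary units) and double the volume.
p_end = integrate_adiabat(1.0, 1.0, 2.0)
invariant_end = p_end * 2.0 ** GAMMA   # should stay equal to its initial value of 1
isotherm_pressure = 1.0 * 1.0 / 2.0    # PV = constant along the isotherm through the same point
print(p_end, invariant_end, isotherm_pressure)
```

The pressure on the adiabat falls below the isotherm value at the same volume, which is why the two families of curves intersect and can close a cycle.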

• Carnot's Theorem: No engine operating between two reservoirs (at temperatures $T_H$
and $T_C$) is more efficient than a Carnot engine operating between them.

Proof: Since a Carnot engine is reversible, it can be run backward as a refrigerator.
Use the non-Carnot engine to run the Carnot engine backward. Let us denote the heat
exchanges of the non-Carnot and Carnot engines by $Q_H$, $Q_C$, and $Q'_H$, $Q'_C$, respectively.
The net effect of the two engines is to transfer heat equal to $Q_H - Q'_H = Q_C - Q'_C$ from
$T_H$ to $T_C$. According to Clausius's statement, the quantity of transferred heat cannot be
negative, i.e. $Q_H \geq Q'_H$. Since the same quantity of work $W$ is involved in this process,
we conclude that

$$\frac{W}{Q_H} \leq \frac{W}{Q'_H}, \quad \Longrightarrow \quad \eta_{\text{Carnot}} \geq \eta_{\text{non-Carnot}}. \tag{I.20}$$

Corollary: All reversible (Carnot) engines have the same universal efficiency $\eta(T_H, T_C)$,
since each can be used to run the other one backward.

The Thermodynamic Temperature Scale: As shown earlier, it is at least theoretically 
possible to construct a Carnot engine using an ideal gas (or any other two parameter 
system) as working substance. We now find that independent of the material used, and 
design and construction, all such cyclic and reversible engines have the same maximum 
theoretical efficiency. Since this maximum efficiency is only dependent on the two tem- 
peratures, it can be used to construct a temperature scale. Such a temperature scale has 
the attractive property of being independent of the properties of any material (e.g. the 
ideal gas). To construct such a scale we first find out how $\eta(T_H, T_C)$ depends on the two
temperatures. Consider two Carnot engines running in series, one between temperatures
$T_1$ and $T_2$, and the other between $T_2$ and $T_3$ ($T_1 > T_2 > T_3$). Denote the heat
exchanges, and work outputs, of the two engines by $Q_1$, $Q_2$, $W_{12}$, and $Q_2$, $Q_3$, $W_{23}$
respectively. Note that the heat dumped by the first engine is taken in by the second, so that
the combined effect is another Carnot engine (since each component is reversible) with heat
exchanges $Q_1$, $Q_3$, and work output $W_{13} = W_{12} + W_{23}$. The three heats are related by

$$\begin{aligned}
Q_2 &= Q_1 - W_{12} = Q_1 \left[1 - \eta(T_1, T_2)\right], \\
Q_3 &= Q_2 - W_{23} = Q_2 \left[1 - \eta(T_2, T_3)\right] = Q_1 \left[1 - \eta(T_1, T_2)\right]\left[1 - \eta(T_2, T_3)\right], \\
Q_3 &= Q_1 - W_{13} = Q_1 \left[1 - \eta(T_1, T_3)\right].
\end{aligned}$$

Comparison of the final two expressions yields

$$\left[1 - \eta(T_1, T_3)\right] = \left[1 - \eta(T_1, T_2)\right]\left[1 - \eta(T_2, T_3)\right]. \tag{I.21}$$

This property implies that $1 - \eta(T_1, T_2)$ can be written as a ratio of the form
$f(T_2)/f(T_1)$, which by convention is set to $T_2/T_1$, i.e.

$$1 - \eta(T_1, T_2) = \frac{Q_2}{Q_1} \equiv \frac{T_2}{T_1}, \quad \Longrightarrow \quad \eta(T_H, T_C) = \frac{T_H - T_C}{T_H}. \tag{I.22}$$

Eq.(I.22) defines temperature up to a constant of proportionality, which is again set by
choosing the triple point of water, ice, and steam to be at 273.16 °K. Throughout this chapter,
I have used the symbols $\Theta$ and $T$ interchangeably. In fact, by running a Carnot cycle
for a perfect gas, it can be proved (see test 1 review problems) that the ideal gas and
thermodynamic temperature scales are equivalent. Clearly the thermodynamic scale is not
useful from a practical stand-point; its advantage is conceptual, in that it is independent
of the properties of any substance. All thermodynamic temperatures are positive, since
according to eq.(I.22) the heat extracted from a temperature $T$ is proportional to it. If a
negative temperature existed, an engine operating between it and a positive temperature
would extract heat from both reservoirs and convert the sum total to work, in violation of
Kelvin's statement of the second law.
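
The series construction can be traced numerically once the convention $1 - \eta = T_2/T_1$ is adopted. The temperatures and input heat below are arbitrary illustrative values, and the helper names are assumptions.

```python
# Two Carnot engines in series versus one engine across the full temperature range.

def carnot_efficiency(t_hot, t_cold):
    """eta(T_H, T_C) = (T_H - T_C) / T_H, eq.(I.22)."""
    return (t_hot - t_cold) / t_hot


def heat_rejected(q_in, t_hot, t_cold):
    """Heat dumped at t_cold by a Carnot engine that takes in q_in at t_hot."""
    return q_in * (1.0 - carnot_efficiency(t_hot, t_cold))


# Hypothetical reservoirs T1 > T2 > T3 and input heat Q1.
t1, t2, t3 = 600.0, 400.0, 300.0
q1 = 1200.0

q2 = heat_rejected(q1, t1, t2)          # first engine, between T1 and T2
q3_series = heat_rejected(q2, t2, t3)   # second engine, fed by the heat dumped by the first
q3_direct = heat_rejected(q1, t1, t3)   # single engine between T1 and T3

print(q2, q3_series, q3_direct)
```

Both routes reject the same heat at $T_3$, which is the content of eq.(I.21).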


I.F Entropy 


The following theorem allows us to construct another function of state using the second 
law. 
• Clausius's Theorem: For any cyclic transformation (reversible or not), $\oint đQ/T \leq 0$,
where $đQ$ refers to the heat increment supplied to the system at temperature $T$.
Proof: We can subdivide the cycle into a series of infinitesimal transformations in which
the system receives energy in the form of heat $đQ$ and work $đW$. The system need not be
in equilibrium at each interval. Direct all the heat exchanges of the system to one port
of a Carnot engine, which has another reservoir at a fixed temperature $T_0$. Since the sign
of $đQ$ is not specified, the Carnot engine must operate a series of infinitesimal cycles in
either direction. To deliver heat $đQ$ to the system at some stage, the engine has to extract
heat $đQ_R$ from the fixed reservoir. If the heat is delivered to a part of the system which is
locally at a temperature $T$, then according to eq.(I.22),

$$đQ_R = T_0 \frac{đQ}{T}. \tag{I.23}$$

After the cycle is completed, the system and the Carnot Engine return to their original
states. The net effect of the combined process is extracting heat $Q_R = \oint đQ_R$ from the
reservoir and converting it to external work $W$. The work $W = Q_R$ is the sum total of the
work elements done by the Carnot engine, and the work performed by the system in the
complete cycle. By Kelvin's statement of the second law, $Q_R = W \leq 0$, i.e.

$$T_0 \oint \frac{đQ}{T} \leq 0, \quad \Longrightarrow \quad \oint \frac{đQ}{T} \leq 0, \tag{I.24}$$

since $T_0 > 0$. Note that $T$ in eq.(I.24) refers to the temperature of the whole system
only for quasi-static processes in which it can be uniquely defined throughout the cycle.
Otherwise, it is just a local temperature (say at a boundary of the system) at which the
Carnot engine deposits the element of heat.

Consequences of Clausius’s theorem: 
(1) For a reversible cycle $\oint đQ_{\text{rev}}/T = 0$, since by running the cycle in the opposite
direction $đQ_{\text{rev}} \to -đQ_{\text{rev}}$, and by the above theorem $\oint đQ_{\text{rev}}/T$ is both
non-negative and non-positive, hence zero. This result implies that the integral of
$đQ_{\text{rev}}/T$ between any two points A and B is independent of path, since for two paths
(1) and (2)

$$\int_A^B \frac{đQ^{(1)}_{\text{rev}}}{T_1} + \int_B^A \frac{đQ^{(2)}_{\text{rev}}}{T_2} = 0, \quad \Longrightarrow \quad \int_A^B \frac{đQ^{(1)}_{\text{rev}}}{T_1} = \int_A^B \frac{đQ^{(2)}_{\text{rev}}}{T_2}. \tag{I.25}$$

(2) Using eq.(I.25) we can construct yet another function of state, the entropy $S$. Since
the integral is independent of path, and only depends on the two end points, we can set

$$S(B) - S(A) \equiv \int_A^B \frac{đQ_{\text{rev}}}{T}. \tag{I.26}$$

For reversible processes, we can now compute the heat from $đQ_{\text{rev}} = T\,dS$. This allows us
to construct adiabatic curves for a general (multi-variable) system from the condition of
constant $S$. Note that eq.(I.26) only defines the entropy up to an overall constant.




(3) Consider an irreversible change from A to B. Make a complete cycle by returning from
B to A along a reversible path. Then

$$\int_A^B \frac{đQ}{T} + \int_B^A \frac{đQ_{\text{rev}}}{T} \leq 0, \quad \Longrightarrow \quad \int_A^B \frac{đQ}{T} \leq S(B) - S(A). \tag{I.27}$$

In differential form, this implies that $dS \geq đQ/T$ for any transformation. In particular,
consider adiabatically isolating a number of subsystems, each initially separately in
equilibrium. As they come to a state of joint equilibrium, since the net $đQ = 0$, we must have
$\delta S \geq 0$. Thus an adiabatic system attains a maximum value of entropy in equilibrium since
spontaneous internal changes can only increase $S$. The direction of increasing entropy thus
points out the arrow of time, and the path to equilibrium.
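
Consequence (3) can be illustrated with the free expansion of an ideal gas discussed earlier. In the irreversible expansion no heat is exchanged, so $\int đQ/T = 0$, while the entropy difference has to be evaluated along a reversible path between the same end states. The sketch below uses a reversible isotherm, on which $dE = 0$ and hence $đQ_{\text{rev}} = P\,dV$; the particle number and volumes are hypothetical, and the midpoint rule is just one convenient quadrature.

```python
import math

K_B = 1.380649e-23  # Boltzmann constant in J/K


def entropy_change_isothermal(n_particles, v_initial, v_final, n_steps=10000):
    """S(B) - S(A) from eq.(I.26) along a reversible isotherm of an ideal gas.

    On the isotherm dE = 0, so dQ_rev = P dV and dQ_rev / T = N k_B dV / V.
    The integral is evaluated with the midpoint rule.
    """
    dv = (v_final - v_initial) / n_steps
    total = 0.0
    for i in range(n_steps):
        v_mid = v_initial + (i + 0.5) * dv
        total += n_particles * K_B * dv / v_mid
    return total


# Hypothetical free expansion: the gas doubles its volume at fixed temperature.
n_gas = 1.0e22
delta_s = entropy_change_isothermal(n_gas, 1.0e-3, 2.0e-3)
heat_integral_irreversible = 0.0  # adiabatic free expansion: no heat exchanged

print(delta_s, n_gas * K_B * math.log(2.0))
```

The entropy increases by $N k_B \ln 2$ even though $\int đQ/T$ vanishes along the actual process, consistent with the inequality in eq.(I.27).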

(4) For a reversible (hence quasi-static) transformation, $đQ = T\,dS$ and
$đW = \sum_i J_i\,dx_i$, and the first law implies

$$dE = đQ + đW = T\,dS + \sum_i J_i\,dx_i. \tag{I.28}$$


Although eq.(I.28) was obtained from a reversible transformation, as a relation between 
functions of state it is a generally valid identity of thermodynamics. Also note that in this 
equation S and T appear as conjugate variables, with S playing the role of a displacement, 
and T as the corresponding force. 

(5) The number of independent variables necessary to describe a thermodynamic system
also follows from eq.(I.28). If there are $n$ methods of doing work on a system, represented
by $n$ conjugate pairs $(J_i, x_i)$, then $n + 1$ independent coordinates are necessary to describe
the system. (We shall ignore possible constraints between the mechanical coordinates.)
For example, choosing $(E, \{x_i\})$ as coordinates, it follows from eq.(I.28) that

$$\left.\frac{\partial S}{\partial E}\right|_{\mathbf{x}} = \frac{1}{T}, \quad \text{and} \quad \left.\frac{\partial S}{\partial x_i}\right|_{E, x_{j \neq i}} = -\frac{J_i}{T}. \tag{I.29}$$

($\mathbf{x}$ and $\mathbf{J}$ will be used as short-hand notations for the parameter sets $\{x_i\}$ and $\{J_i\}$.)



I.G Approach to Equilibrium and Thermodynamic Potentials


Evolution of non-equilibrium systems towards equilibrium is governed by the second 
law of thermodynamics. For example, in the previous section we showed that for an adia- 
batically isolated system entropy must increase in any spontaneous change and reaches a 
maximum in equilibrium. What about out of equilibrium systems that are not adiabati- 
cally isolated and which may also be subject to external mechanical work? It is usually 
possible to define other thermodynamic potentials that are extremized when the system is 
in equilibrium. 

Enthalpy is the appropriate function when there is no heat exchange ($đQ = 0$), and
the system comes to mechanical equilibrium with a constant external force. The minimum
enthalpy principle merely formulates the observation that stable mechanical equilibrium
is obtained by minimizing the net potential energy of the system plus the external agent.
For example, consider a spring of natural extension $L_0$ and spring constant $K$, subject to
the force exerted by a particle of mass $m$. For an extension $x = L - L_0$, the internal energy
of the spring is $Kx^2/2$, while there is a change of $-mgx$ in the potential energy of the
particle. Mechanical equilibrium is obtained by minimizing $Kx^2/2 - mgx$ at an extension
$x_{eq} = mg/K$. The spring at any other value of the displacement initially oscillates before
coming to rest at $x_{eq}$ due to friction. For general displacements $\mathbf{x}$, at constant
generalized forces $\mathbf{J}$, the work input to the system is $đW \leq \mathbf{J} \cdot \delta\mathbf{x}$.
(Equality is achieved for a reversible change, but there is generally some loss of the external
work into friction.) Since $đQ = 0$, using the first law, $\delta E \leq \mathbf{J} \cdot \delta\mathbf{x}$, and

$$\delta H \leq 0, \quad \text{where} \quad H = E - \mathbf{J} \cdot \mathbf{x} \tag{I.30}$$

is the enthalpy. The variations of $H$ in equilibrium are given by

$$dH = dE - d(\mathbf{J} \cdot \mathbf{x}) = T\,dS + \mathbf{J} \cdot d\mathbf{x} - \mathbf{x} \cdot d\mathbf{J} - \mathbf{J} \cdot d\mathbf{x} = T\,dS - \mathbf{x} \cdot d\mathbf{J}. \tag{I.31}$$

The equality in eq.(I.31), and the inequality in eq.(I.30), are a possible source of confusion.
Note that eq.(I.30) refers to variations of $H$ on approaching equilibrium as some parameter
that is not a function of state is varied (e.g. the velocity of the particle joined to the spring
in the above example). By contrast eq.(I.31) describes a relation between equilibrium
coordinates. To differentiate the two cases, I will denote the former non-equilibrium
variations by $\delta$.
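
The spring example can be reproduced numerically by scanning the quantity $Kx^2/2 - mgx$ over a grid of extensions. The spring constant, mass, value of $g$, and grid below are illustrative assumptions, and a brute-force scan is used only because it is transparent.

```python
# Locate the minimum of K x^2 / 2 - m g x for the loaded spring (illustrative values).

def spring_enthalpy(x, spring_constant, mass, g=9.81):
    """Spring energy minus the drop in potential energy of the hanging mass (g assumed 9.81 m/s^2)."""
    return 0.5 * spring_constant * x ** 2 - mass * g * x


def minimize_on_grid(func, lo, hi, n_points=100001):
    """Brute-force search for the grid point with the smallest function value."""
    best_x, best_val = lo, func(lo)
    for i in range(1, n_points):
        x = lo + (hi - lo) * i / (n_points - 1)
        val = func(x)
        if val < best_val:
            best_x, best_val = x, val
    return best_x


K_SPRING = 200.0  # N/m, hypothetical
MASS = 0.5        # kg, hypothetical

x_min = minimize_on_grid(lambda x: spring_enthalpy(x, K_SPRING, MASS), 0.0, 0.1)
x_expected = MASS * 9.81 / K_SPRING  # x_eq = m g / K
print(x_min, x_expected)
```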




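The coordinate set $(S, \mathbf{J})$ is the natural choice for describing the enthalpy, and it
follows from eq.(I.31) that

$$T = \left.\frac{\partial H}{\partial S}\right|_{\mathbf{J}}, \quad \text{and} \quad x_i = -\left.\frac{\partial H}{\partial J_i}\right|_{S, J_{j \neq i}}. \tag{I.32}$$

Variations of the enthalpy with temperature are related to heat capacities at constant
force, for example

$$C_P = \left.\frac{đQ}{dT}\right|_P = \left.\frac{dE + P\,dV}{dT}\right|_P = \left.\frac{d(E + PV)}{dT}\right|_P = \left.\frac{dH}{dT}\right|_P. \tag{I.33}$$

Note, however, that a change of variables is necessary to express $H$ in terms of $T$, rather
than the more natural variable $S$.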

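Helmholtz Free Energy is useful for isothermal transformations in the absence of
mechanical work ($đW = 0$). From Clausius's theorem, the heat intake of a system at a
constant temperature $T$ satisfies $đQ \leq T\,\delta S$. Hence $\delta E = đQ + đW \leq T\,\delta S$, and

$$\delta F \leq 0, \quad \text{where} \quad F = E - TS \tag{I.34}$$

is the Helmholtz free energy. Since

$$dF = dE - d(TS) = T\,dS + \mathbf{J} \cdot d\mathbf{x} - S\,dT - T\,dS = -S\,dT + \mathbf{J} \cdot d\mathbf{x}, \tag{I.35}$$

the coordinate set $(T, \mathbf{x})$ (the quantities kept constant during an isothermal transformation
with no work) is most suitable for describing the free energy. The equilibrium forces and
entropy can be obtained from

$$J_i = \left.\frac{\partial F}{\partial x_i}\right|_{T, x_{j \neq i}}, \quad S = -\left.\frac{\partial F}{\partial T}\right|_{\mathbf{x}}. \tag{I.36}$$

The internal energy can also be calculated from $F$ using

$$E = F + TS = F - T \left.\frac{\partial F}{\partial T}\right|_{\mathbf{x}} = -T^2 \left.\frac{\partial (F/T)}{\partial T}\right|_{\mathbf{x}}. \tag{I.37}$$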
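
Eqs.(I.36) and (I.37) can be tried out on a simple model free energy. The form $F = -N k_B T \ln(V T^{3/2})$ used below is an assumption made for illustration (it reproduces $PV = N k_B T$ and $E = \frac{3}{2} N k_B T$ up to an additive constant in the entropy); units are chosen so that $N k_B = 1$, and derivatives are taken by finite differences.

```python
import math

N_KB = 1.0  # N * k_B in arbitrary units (assumption)


def free_energy(temperature, volume):
    """Model Helmholtz free energy of a monatomic ideal gas, up to a term linear in T (assumed form)."""
    return -N_KB * temperature * math.log(volume * temperature ** 1.5)


def derivative(func, x, step=1e-5):
    """Central finite-difference derivative."""
    return (func(x + step) - func(x - step)) / (2.0 * step)


def pressure(temperature, volume):
    """J = dF/dx with J = -P and x = V, i.e. P = -dF/dV at constant T."""
    return -derivative(lambda v: free_energy(temperature, v), volume)


def entropy(temperature, volume):
    """S = -dF/dT at constant V."""
    return -derivative(lambda t: free_energy(t, volume), temperature)


def energy_from_legendre(temperature, volume):
    """E = F + T S."""
    return free_energy(temperature, volume) + temperature * entropy(temperature, volume)


def energy_from_ratio(temperature, volume):
    """E = -T^2 d(F/T)/dT at constant V."""
    return -temperature ** 2 * derivative(lambda t: free_energy(t, volume) / t, temperature)


T0, V0 = 2.0, 3.0
print(pressure(T0, V0), energy_from_legendre(T0, V0), energy_from_ratio(T0, V0))
```

Both routes to the energy agree with $\frac{3}{2} N k_B T$, and the pressure agrees with $N k_B T / V$, for this model.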


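Gibbs Free Energy applies to isothermal transformations involving mechanical work
at constant external force. The natural inequalities for work and heat input into the system
are given by $đW \leq \mathbf{J} \cdot \delta\mathbf{x}$ and $đQ \leq T\,\delta S$. Hence
$\delta E \leq T\,\delta S + \mathbf{J} \cdot \delta\mathbf{x}$, leading to

$$\delta G \leq 0, \quad \text{where} \quad G = E - TS - \mathbf{J} \cdot \mathbf{x} \tag{I.38}$$

is the Gibbs free energy. Variations of $G$ are given by

$$dG = dE - d(TS) - d(\mathbf{J} \cdot \mathbf{x}) = T\,dS + \mathbf{J} \cdot d\mathbf{x} - S\,dT - T\,dS - \mathbf{x} \cdot d\mathbf{J} - \mathbf{J} \cdot d\mathbf{x} = -S\,dT - \mathbf{x} \cdot d\mathbf{J}, \tag{I.39}$$

and most easily expressed in terms of $(T, \mathbf{J})$.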


                 đQ = 0         constant T
đW = 0           δS ≥ 0         δF ≤ 0
constant J       δH ≤ 0         δG ≤ 0


Table 2: Inequalities satisfied by thermodynamic potentials.


Table 2 summarizes the above results on thermodynamic functions. Eqs.(I.30),
(I.34), and (I.38) are examples of Legendre transformations, used to change variables
to the most natural set of coordinates for describing a particular situation. So far, we
implicitly assumed a constant number of particles in the system. In chemical reactions,
and in equilibrium between two phases, the number of particles in a given constituent
may change. The change in the number of particles necessarily involves changes in the
internal energy, which is expressed in terms of a chemical work $đW = \boldsymbol{\mu} \cdot d\mathbf{N}$.
Here $\mathbf{N} = \{N_1, N_2, \cdots\}$ lists the number of particles of each species, and
$\boldsymbol{\mu} = \{\mu_1, \mu_2, \cdots\}$ the associated chemical potentials which measure the work
necessary to add additional particles to the system. Traditionally, chemical work is treated
differently from mechanical work and is not subtracted from $E$ in the Gibbs free energy of
eq.(I.38). For chemical equilibrium in circumstances that involve no mechanical work, the
appropriate state function is the Grand Potential given by

$$\mathcal{G} = E - TS - \boldsymbol{\mu} \cdot \mathbf{N}. \tag{I.40}$$

$\mathcal{G}(T, \boldsymbol{\mu}, \mathbf{x})$ is minimized in chemical equilibrium, and its variations in
general satisfy

$$d\mathcal{G} = -S\,dT + \mathbf{J} \cdot d\mathbf{x} - \mathbf{N} \cdot d\boldsymbol{\mu}. \tag{I.41}$$


Example: To illustrate the concepts of this section, consider $N$ particles of supersaturated
steam in a container of volume $V$ at a temperature $T$. How can we describe the approach
of steam to an equilibrium mixture with $N_w$ particles in the liquid and $N_s$ particles in the
gas phase? The fixed coordinates describing this system are $V$, $T$, and $N$. The appropriate
thermodynamic function from Table 2 is the Helmholtz free energy $F(V, T, N)$, whose
variations satisfy

$$dF = d(E - TS) = -S\,dT - P\,dV + \mu\,dN. \tag{I.42}$$

Before the system reaches equilibrium at a particular value of $N_w$, it goes through a series
of non-equilibrium states with smaller amounts of water. If the process is sufficiently slow,
we can construct an out of equilibrium value for $F$ as

$$F(V, T, N | N_w) = F_w(T, N_w) + F_s(V, T, N - N_w), \tag{I.43}$$

which depends on an additional variable $N_w$. (It is assumed that the volume occupied by
water is small and irrelevant.) According to eq.(I.34), the equilibrium point is obtained by
minimizing $F$ with respect to this variable. Since

$$\delta F = \left.\frac{\partial F_w}{\partial N_w}\right|_{T,V} \delta N_w - \left.\frac{\partial F_s}{\partial N_s}\right|_{T,V} \delta N_w, \tag{I.44}$$

and $\left.\partial F/\partial N\right|_{T,V} = \mu$ from eq.(I.42), the equilibrium condition can be obtained
by equating the chemical potentials, i.e. from $\mu_w(V, T) = \mu_s(V, T)$. The identity of chemical
potentials is the condition for chemical equilibrium. Naturally, to proceed further we need
expressions for $\mu_w$ and $\mu_s$.


I.H Useful Mathematical Results


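(1) Extensivity: Including chemical work, variations of the extensive coordinates of the
system are related by (generalizing eq.(I.28))

$$dE = T\,dS + \mathbf{J} \cdot d\mathbf{x} + \boldsymbol{\mu} \cdot d\mathbf{N}. \tag{I.45}$$

For fixed intensive coordinates, the extensive quantities are simply proportional to size or
to the number of particles. This proportionality is expressed mathematically by

$$E(\lambda S, \lambda \mathbf{x}, \lambda \mathbf{N}) = \lambda E(S, \mathbf{x}, \mathbf{N}). \tag{I.46}$$

Evaluating the derivative of the above equation with respect to $\lambda$ at $\lambda = 1$, results in

$$\left.\frac{\partial E}{\partial S}\right|_{\mathbf{x}, \mathbf{N}} S + \sum_i \left.\frac{\partial E}{\partial x_i}\right|_{S, x_{j \neq i}, \mathbf{N}} x_i + \sum_\alpha \left.\frac{\partial E}{\partial N_\alpha}\right|_{S, \mathbf{x}, N_{\beta \neq \alpha}} N_\alpha = E(S, \mathbf{x}, \mathbf{N}). \tag{I.47}$$

The partial derivatives in the above equation can be identified from eq.(I.45) as $T$, $J_i$, and
$\mu_\alpha$ respectively. Substituting these values into eq.(I.47) leads to the so called fundamental
equation of thermodynamics

$$E = TS + \mathbf{J} \cdot \mathbf{x} + \boldsymbol{\mu} \cdot \mathbf{N}. \tag{I.48}$$

Combining the variations of eq.(I.48) with eq.(I.45) leads to a constraint between the
variations of intensive coordinates

$$S\,dT + \mathbf{x} \cdot d\mathbf{J} + \mathbf{N} \cdot d\boldsymbol{\mu} = 0, \tag{I.49}$$

known as the Gibbs-Duhem relation.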
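
The step from eq.(I.46) to eq.(I.48) is Euler's theorem for homogeneous functions, and it can be checked on any function of degree one. The energy function below, $E = S^3/(NV)$, is a purely hypothetical example chosen because it scales correctly; it is not meant to describe a real substance, and the state point is arbitrary.

```python
# Numerical check of E = T S + J x + mu N for a toy homogeneous energy function.

def energy(entropy, volume, n_particles):
    """Hypothetical E(S, V, N), homogeneous of degree one in its arguments."""
    return entropy ** 3 / (n_particles * volume)


def partial(func, args, index, rel_step=1e-6):
    """Central-difference partial derivative with respect to args[index]."""
    step = rel_step * args[index]
    up, down = list(args), list(args)
    up[index] += step
    down[index] -= step
    return (func(*up) - func(*down)) / (2.0 * step)


state = (2.0, 3.0, 5.0)  # arbitrary (S, V, N)

temperature = partial(energy, state, 0)          # T  = dE/dS
force = partial(energy, state, 1)                # J  = dE/dV (equal to -P for a fluid)
chemical_potential = partial(energy, state, 2)   # mu = dE/dN

euler_sum = (temperature * state[0] + force * state[1]
             + chemical_potential * state[2])
print(euler_sum, energy(*state))
```

Doubling all three arguments doubles the energy for this toy function, and the weighted sum of its derivatives reproduces the energy itself.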
Example: For a fixed amount of gas ($dN = 0$), variations of the chemical potential along
an isotherm can be calculated as follows. Since $dT = 0$, the Gibbs-Duhem relation gives
$-V\,dP + N\,d\mu = 0$, and
$$d\mu = \frac{V}{N}\,dP = k_B T\,\frac{dP}{P}, \tag{I.50}$$
where we have used the ideal gas equation of state $PV = Nk_BT$. Integrating the above
equation gives
$$\mu = \mu_0 + k_B T \ln\frac{P}{P_0} = \mu_0 - k_B T \ln\frac{V}{V_0}, \tag{I.51}$$
where $(P_0, V_0, \mu_0)$ refer to the coordinates of a reference point.
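The integration leading to eq.(I.51) can be checked numerically. The following is a minimal sketch, assuming units in which $k_BT = 1$ and an arbitrary pair of pressures; the function names are illustrative and not part of the notes.

```python
import math

def delta_mu_numeric(p0, p1, kT, steps=100_000):
    """Integrate d(mu) = (V/N) dP = kT dP/P from p0 to p1 with the midpoint rule."""
    h = (p1 - p0) / steps
    total = 0.0
    for i in range(steps):
        p_mid = p0 + (i + 0.5) * h
        total += kT / p_mid * h
    return total

def delta_mu_exact(p0, p1, kT):
    """Closed form of eq.(I.51): mu - mu0 = kT ln(P/P0)."""
    return kT * math.log(p1 / p0)

kT = 1.0  # assumed units
numeric = delta_mu_numeric(1.0, 3.0, kT)
exact = delta_mu_exact(1.0, 3.0, kT)
print(numeric, exact)
```

The two values agree to within the discretization error of the midpoint rule.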

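(2) Maxwell's Relations: Combining the mathematical rules of differentiation with
thermodynamic relationships leads to several useful results. The most important of
these are Maxwell's relations, which follow from the commutative property
$[\partial_x\partial_y f(x,y) = \partial_y\partial_x f(x,y)]$ of derivatives. For example, it follows from eq.(I.45) that
$$\left.\frac{\partial E}{\partial S}\right|_{\mathbf{x},\mathbf{N}} = T, \quad\text{and}\quad \left.\frac{\partial E}{\partial x_i}\right|_{S,x_{j\neq i},\mathbf{N}} = J_i. \tag{I.52}$$
The joint second derivative of $E$ is then given by
$$\frac{\partial^2 E}{\partial S\,\partial x_i} = \frac{\partial^2 E}{\partial x_i\,\partial S} = \left.\frac{\partial T}{\partial x_i}\right|_{S} = \left.\frac{\partial J_i}{\partial S}\right|_{x_i}. \tag{I.53}$$
Since $(\partial y/\partial x) = (\partial x/\partial y)^{-1}$, the above equation can be inverted to give
$$\left.\frac{\partial S}{\partial J_i}\right|_{x_i} = \left.\frac{\partial x_i}{\partial T}\right|_{S}. \tag{I.54}$$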


Similar identities can be obtained from the variations of other state functions. Suppose
that we are interested in finding an identity involving $\partial S/\partial x|_T$. We would like
to find a state function whose variations include $S\,dT$ and $J\,dx$. The correct choice is
$dF = d(E - TS) = -S\,dT + J\,dx$. Looking at the second derivative of $F$ yields the Maxwell
relation
$$-\left.\frac{\partial S}{\partial x}\right|_{T} = \left.\frac{\partial J}{\partial T}\right|_{x}. \tag{I.55}$$
To calculate $\partial S/\partial J|_T$, consider $d(E - TS - Jx) = -S\,dT - x\,dJ$, which leads to the identity
$$\left.\frac{\partial S}{\partial J}\right|_{T} = \left.\frac{\partial x}{\partial T}\right|_{J}. \tag{I.56}$$
There are a variety of mnemonics which are supposed to help you remember and construct
Maxwell's relations, such as Magic Squares, Jacobians, etc. I personally don't find any of
these methods worth learning. The most logical approach is to remember the laws of
thermodynamics and hence eq.(I.28), and to then manipulate it so as to find the appropriate
derivative using the rules of differentiation.

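Example: To obtain $\partial\mu/\partial P|_{N,T}$ for an ideal gas, start with
$d(E - TS + PV) = -S\,dT + V\,dP + \mu\,dN$. Clearly
$$\left.\frac{\partial \mu}{\partial P}\right|_{N,T} = \left.\frac{\partial V}{\partial N}\right|_{T,P} = \frac{V}{N} = \frac{k_B T}{P}, \tag{I.57}$$
as in eq.(I.50). From eq.(I.28) it also follows that
$$\left.\frac{\partial S}{\partial V}\right|_{E,N} = \frac{P}{T} = -\frac{\left.\partial E/\partial V\right|_{S,N}}{\left.\partial E/\partial S\right|_{V,N}}, \tag{I.58}$$
where we have used eq.(I.45) for the final identity. The above equation can be rearranged
into
$$\left.\frac{\partial S}{\partial V}\right|_{E,N} \left.\frac{\partial E}{\partial S}\right|_{V,N} \left.\frac{\partial V}{\partial E}\right|_{S,N} = -1, \tag{I.59}$$
which is an illustration of the chain rule of differentiation.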
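The triple product in eq.(I.59) can be verified by finite differences. The sketch below assumes a monatomic ideal gas with $S = Nk_B[\ln V + \tfrac{3}{2}\ln E]$ up to a constant (an assumption made here for illustration, in units with $Nk_B = 1$), and inverts it by hand to get $E(S,V)$ and $V(S,E)$.

```python
import math

NK = 1.0  # N * k_B in arbitrary units (assumed)

def S_of(E, V):
    """Entropy of a monatomic ideal gas, up to an additive constant (assumed form)."""
    return NK * (math.log(V) + 1.5 * math.log(E))

def E_of(S, V):
    """Inverse of S_of with respect to E."""
    return math.exp((S / NK - math.log(V)) / 1.5)

def V_of(S, E):
    """Inverse of S_of with respect to V."""
    return math.exp(S / NK - 1.5 * math.log(E))

def central(f, x, h=1e-5):
    """Central finite-difference estimate of df/dx."""
    return (f(x + h) - f(x - h)) / (2 * h)

E0, V0 = 2.0, 3.0
S0 = S_of(E0, V0)

dS_dV = central(lambda V: S_of(E0, V), V0)  # at fixed E
dE_dS = central(lambda S: E_of(S, V0), S0)  # at fixed V
dV_dE = central(lambda E: V_of(S0, E), E0)  # at fixed S

triple_product = dS_dV * dE_dS * dV_dE
print(triple_product)
```

The product comes out at $-1$ up to finite-difference error, rather than the $+1$ a naive cancellation of differentials would suggest.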




I.I Stability Conditions


The conditions derived in section I.G are similar to the well known requirements for 
mechanical stability. A particle moving in an external potential U settles to a stable 
equilibrium at a minimum value of U. In addition to the vanishing of the force —U’, this 
is a consequence of the loss of energy to frictional processes. Stable equilibrium occurs at 
a minimum of the potential energy. For a thermodynamic system, equilibrium occurs at 
the extremum of the appropriate potential, for example at the maximum value of entropy 
for an isolated system. The requirement that spontaneous changes should always lead 
to an increased entropy, places important constraints on equilibrium response functions, 
discussed in this section. 

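Consider a homogeneous system at equilibrium, characterized by intensive state functions
$(T, \mathbf{J}, \boldsymbol{\mu})$, and extensive variables $(E, \mathbf{x}, \mathbf{N})$. Now imagine that the system is arbitrarily
divided into two equal parts, and that one part spontaneously transfers some energy
to the other part in the form of work or heat. The two subsystems, A and B, initially
have the same values for the intensive variables, while their extensive coordinates satisfy
$E_A + E_B = E$, $\mathbf{x}_A + \mathbf{x}_B = \mathbf{x}$, and $\mathbf{N}_A + \mathbf{N}_B = \mathbf{N}$. After the exchange of energy between
the two subsystems, the coordinates of A change to
$$(E_A + \delta E,\ \mathbf{x}_A + \delta\mathbf{x},\ \mathbf{N}_A + \delta\mathbf{N}), \quad\text{and}\quad (T_A + \delta T_A,\ \mathbf{J}_A + \delta\mathbf{J}_A,\ \boldsymbol{\mu}_A + \delta\boldsymbol{\mu}_A), \tag{I.60}$$
and those of B to
$$(E_B - \delta E,\ \mathbf{x}_B - \delta\mathbf{x},\ \mathbf{N}_B - \delta\mathbf{N}), \quad\text{and}\quad (T_B + \delta T_B,\ \mathbf{J}_B + \delta\mathbf{J}_B,\ \boldsymbol{\mu}_B + \delta\boldsymbol{\mu}_B). \tag{I.61}$$
Note that the overall system is maintained at constant $E$, $\mathbf{x}$, and $\mathbf{N}$. Since the intensive
variables are themselves functions of the extensive coordinates, to first order in the
variations of $(E, \mathbf{x}, \mathbf{N})$, we have
$$\delta T_A = -\delta T_B \equiv \delta T, \quad \delta\mathbf{J}_A = -\delta\mathbf{J}_B \equiv \delta\mathbf{J}, \quad \delta\boldsymbol{\mu}_A = -\delta\boldsymbol{\mu}_B \equiv \delta\boldsymbol{\mu}. \tag{I.62}$$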


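Using eq.(I.48), the entropy of the system can be written as
$$S = S_A + S_B = \left(\frac{E_A}{T_A} - \frac{\mathbf{J}_A}{T_A}\cdot\mathbf{x}_A - \frac{\boldsymbol{\mu}_A}{T_A}\cdot\mathbf{N}_A\right) + \left(\frac{E_B}{T_B} - \frac{\mathbf{J}_B}{T_B}\cdot\mathbf{x}_B - \frac{\boldsymbol{\mu}_B}{T_B}\cdot\mathbf{N}_B\right). \tag{I.63}$$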


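Since by assumption we are expanding about the equilibrium point, the first order changes
vanish, and to second order
$$\delta S = \delta S_A + \delta S_B = 2\left[\delta\!\left(\frac{1}{T_A}\right)\delta E_A - \delta\!\left(\frac{\mathbf{J}_A}{T_A}\right)\cdot\delta\mathbf{x}_A - \delta\!\left(\frac{\boldsymbol{\mu}_A}{T_A}\right)\cdot\delta\mathbf{N}_A\right]. \tag{I.64}$$
(We have used eq.(I.62) to note that the second order contribution of B is the same as A.)
Eq.(I.64) can be rearranged to
$$\delta S = -\frac{2}{T_A}\left[\delta T_A\left(\frac{\delta E_A - \mathbf{J}_A\cdot\delta\mathbf{x}_A - \boldsymbol{\mu}_A\cdot\delta\mathbf{N}_A}{T_A}\right) + \delta\mathbf{J}_A\cdot\delta\mathbf{x}_A + \delta\boldsymbol{\mu}_A\cdot\delta\mathbf{N}_A\right] = -\frac{2}{T_A}\left[\delta T_A\,\delta S_A + \delta\mathbf{J}_A\cdot\delta\mathbf{x}_A + \delta\boldsymbol{\mu}_A\cdot\delta\mathbf{N}_A\right]. \tag{I.65}$$
The condition for stable equilibrium is that any change should lead to a decrease in entropy,
and hence we must have
$$\delta T\,\delta S + \delta\mathbf{J}\cdot\delta\mathbf{x} + \delta\boldsymbol{\mu}\cdot\delta\mathbf{N} \geq 0. \tag{I.66}$$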


We have now removed the subscript A, as the above condition must apply to the whole 
system as well as to any part of it. 

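The above condition was obtained assuming that the overall system was kept at constant
$E$, $\mathbf{x}$, and $\mathbf{N}$. In fact, since all coordinates appear symmetrically in this expression,
the same result is obtained for any other set of constraints. For example, variations in $\delta T$
and $\delta\mathbf{x}$ with $\delta\mathbf{N} = 0$, lead to
$$\delta S = \left.\frac{\partial S}{\partial T}\right|_{\mathbf{x}}\delta T + \left.\frac{\partial S}{\partial x_i}\right|_{T}\delta x_i, \qquad \delta J_i = \left.\frac{\partial J_i}{\partial T}\right|_{\mathbf{x}}\delta T + \left.\frac{\partial J_i}{\partial x_j}\right|_{T}\delta x_j. \tag{I.67}$$
Substituting these variations into eq.(I.66) leads to
$$\left.\frac{\partial S}{\partial T}\right|_{\mathbf{x}}(\delta T)^2 + \left.\frac{\partial J_i}{\partial x_j}\right|_{T}\delta x_i\,\delta x_j \geq 0. \tag{I.68}$$
Note that the cross terms proportional to $\delta T\,\delta x_i$ cancel due to the Maxwell relation
$-\partial S/\partial x|_T = \partial J/\partial T|_x$ of eq.(I.55). Eq.(I.68) is a quadratic form, and must be positive for all
choices of $\delta T$ and $\delta\mathbf{x}$. The resulting constraints on the coefficients are independent of how
the system was initially partitioned into subsystems A and B, and represent the conditions
for stable equilibrium.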




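If only $\delta T$ is non-zero, eq.(I.66) requires $\partial S/\partial T|_{\mathbf{x}} \geq 0$, implying a positive heat capacity,
since
$$C_{\mathbf{x}} = \left.\frac{dQ}{dT}\right|_{\mathbf{x}} = T\left.\frac{\partial S}{\partial T}\right|_{\mathbf{x}} \geq 0. \tag{I.69}$$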

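If only one of the $\delta x_i$ in eq.(I.66) is non-zero, the corresponding response function
$\partial x_i/\partial J_i|_{T,x_{j\neq i}}$ must be positive. However, a more general requirement exists since all $\delta\mathbf{x}$
values may be chosen non-zero. The general requirement is that the matrix of coefficients
$\partial J_i/\partial x_j|_{T}$ must be positive definite. A matrix is positive definite if all of its eigenvalues
are positive. It is necessary, but not sufficient, that all the diagonal elements of such a
matrix (the inverse response functions) be positive, leading to further constraints between
the response functions. Including chemical work for a gas, the appropriate matrix is
$$\begin{bmatrix} -\left.\dfrac{\partial P}{\partial V}\right|_{T,N} & -\left.\dfrac{\partial P}{\partial N}\right|_{T,V} \\[2ex] \left.\dfrac{\partial \mu}{\partial V}\right|_{T,N} & \left.\dfrac{\partial \mu}{\partial N}\right|_{T,V} \end{bmatrix}. \tag{I.70}$$
In addition to the positivity of the response functions $\kappa_{T,N} = -V^{-1}\,\partial V/\partial P|_{T,N}$ and
$\partial N/\partial\mu|_{T,V}$, the determinant of the matrix must be positive, requiring
$$\left.\frac{\partial P}{\partial N}\right|_{T,V}\left.\frac{\partial \mu}{\partial V}\right|_{T,N} - \left.\frac{\partial P}{\partial V}\right|_{T,N}\left.\frac{\partial \mu}{\partial N}\right|_{T,V} \geq 0. \tag{I.71}$$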


Another interesting consequence of eq.(I.66) relates to the critical point of a gas where
$\partial P/\partial V|_{T_c,N} = 0$. Assuming that the critical isotherm can be analytically expanded as
$$\delta P(T = T_c) = \left.\frac{\partial P}{\partial V}\right|_{T_c,N}\delta V + \frac{1}{2}\left.\frac{\partial^2 P}{\partial V^2}\right|_{T_c,N}\delta V^2 + \frac{1}{6}\left.\frac{\partial^3 P}{\partial V^3}\right|_{T_c,N}\delta V^3 + \cdots, \tag{I.72}$$
the stability condition $-\delta P\,\delta V \geq 0$ implies that $\partial^2 P/\partial V^2|_{T_c,N}$ must be zero, and the third
derivative negative, if the first derivative vanishes. This condition is used to obtain the
critical point of the gas from mean-field approximations to the isotherms (such as the van
der Waals isotherms). In fact, it is usually not justified to make a Taylor expansion around
the critical point as in eq.(I.72), although the constraint $-\delta P\,\delta V \geq 0$ remains applicable.
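As an illustration of how this condition locates a critical point, the sketch below uses the van der Waals isotherm per particle, $P = k_BT/(v-b) - a/v^2$. This specific form and the parameter values $a = b = 1$ are assumptions introduced here; the notes only mention van der Waals isotherms by name. The standard critical values $v_c = 3b$ and $k_BT_c = 8a/(27b)$ are then checked against the three derivatives.

```python
def vdw_derivatives(v, kT, a, b):
    """First three volume derivatives of P = kT/(v - b) - a/v**2 (per particle)."""
    d1 = -kT / (v - b) ** 2 + 2.0 * a / v ** 3
    d2 = 2.0 * kT / (v - b) ** 3 - 6.0 * a / v ** 4
    d3 = -6.0 * kT / (v - b) ** 4 + 24.0 * a / v ** 5
    return d1, d2, d3

a, b = 1.0, 1.0                # hypothetical van der Waals parameters
v_c = 3.0 * b                  # critical volume per particle
kT_c = 8.0 * a / (27.0 * b)    # critical temperature (times k_B)

d1, d2, d3 = vdw_derivatives(v_c, kT_c, a, b)
print(d1, d2, d3)
```

The first and second derivatives vanish and the third is negative ($-a/81b^5$), as the stability condition requires.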




I.J The Third Law 


Differences in entropy between two states can be computed using the second law,
from $\Delta S = \int dQ_{\rm rev}/T$. Low temperature experiments indicate that $\Delta S(\mathbf{X},T)$ vanishes as
$T$ goes to zero for any set of coordinates $\mathbf{X}$. This observation is independent of the other
laws of thermodynamics, leading to the formulation of a third law by Nernst, which states
• The entropy of all systems at zero absolute temperature is a universal constant that can
be taken to be zero.
This implies that
$$\lim_{T\to 0} S(\mathbf{X},T) = 0, \tag{I.73}$$
which is a stronger requirement than the vanishing of the differences $\Delta S(\mathbf{X},T)$. This
extended condition has been tested for metastable phases of a substance. Certain materials 
such as sulphur or phosphine can exist in a number of rather similar crystalline structures 
(allotropes). Of course, at a given temperature only one of these structures is truly stable. 
Let us imagine that as the high temperature equilibrium phase A is cooled slowly, it makes
a transition at a temperature $T^*$ to phase B, releasing latent heat $L$. Under more rapid
cooling conditions the transition is avoided, and phase A persists in metastable equilibrium.
The entropies in the two phases can be calculated by measuring the heat capacities $C_A(T)$
and $C_B(T)$. Starting from $T = 0$, the entropy at a temperature slightly above $T^*$ can be
computed along the two possible paths as
$$S(T^* + \epsilon) = S_A(0) + \int_0^{T^*} dT'\,\frac{C_A(T')}{T'} = S_B(0) + \int_0^{T^*} dT'\,\frac{C_B(T')}{T'} + \frac{L}{T^*}. \tag{I.74}$$
Such measurements have indeed verified that $S_A(0) = S_B(0) = 0$.
Consequences of the third law: 


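(1) Since $S(T = 0, \mathbf{X}) = 0$ for all coordinates $\mathbf{X}$,
$$\lim_{T\to 0}\left.\frac{\partial S}{\partial \mathbf{X}}\right|_{T} = 0. \tag{I.75}$$
(2) Heat capacities must vanish as $T \to 0$ since
$$S(T,\mathbf{X}) - S(0,\mathbf{X}) = \int_0^{T} dT'\,\frac{C_{\mathbf{X}}(T')}{T'}, \tag{I.76}$$
and the integral diverges as $T \to 0$ unless
$$\lim_{T\to 0} C_{\mathbf{X}}(T) = 0. \tag{I.77}$$
(3) Thermal expansivities also vanish as $T \to 0$ since
$$\alpha_J = \frac{1}{x}\left.\frac{\partial x}{\partial T}\right|_{J} = \frac{1}{x}\left.\frac{\partial S}{\partial J}\right|_{T}. \tag{I.78}$$
The second equality follows from the Maxwell relation in eq.(I.56). The vanishing of the
latter is guaranteed by eq.(I.75).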
(4) It is impossible to cool any system to absolute zero temperature in a finite number of
steps. For example, we can cool a gas by an adiabatic reduction in pressure. Since the
curves of $S$ versus $T$ for different pressures must join at $T = 0$, successive steps involve
progressively smaller changes, in $S$ and in $T$, on approaching zero temperature. Alternatively,
the unattainability of zero temperatures implies that $S(T = 0, P)$ is independent
of $P$. This is a weaker statement of the third law which also implies the equality of zero
temperature entropy for different substances.

In the following sections, we shall attempt to justify the laws of thermodynamics from 
a microscopic point of view. The first law is clearly a reflection of the conservation of 
energy, which also operates at the microscopic level. The zeroth and second laws suggest 
an irreversible approach to equilibrium, a concept that has no analog at the particulate 
level. It is justified as reflecting the collective behavior of large numbers of degrees of 
freedom. In statistical mechanics the entropy is calculated as $S = k_B \ln g$, where $g$ is the
degeneracy of the states (number of configurations with the same energy). The third law 
of thermodynamics thus requires that g = 1 at T = 0, i.e. that the ground state of any 
system is unique. This condition does not hold within the framework of classical statisti- 
cal mechanics, as there are examples of both non-interacting (such as an ideal gas), and 
interacting (the frustrated spins in a triangular antiferromagnet) systems with degenerate 
ground states, and a finite zero temperature entropy. However, classical mechanics is inap- 
plicable at very low temperatures and energies where quantum effects become important. 
The third law is then equivalent to the statement that the ground state of a quantum 
mechanical system is unique. While this can be proved for a non-interacting system, there 
is no general proof of its validity with interactions. Unfortunately, the onset of quantum
effects (and other possible origins of the breaking of classical degeneracy) is system specific.
Hence it is not a priori clear how low the temperature must be, before the predictions
of the third law can be observed. Another deficiency of the law is its inapplicability to
glassy phases. Glasses result from the freezing of supercooled liquids into configurations
with extremely slow dynamics. While not truly equilibrium phases (and hence not strictly
subject to all the laws of thermodynamics), they are effectively so due to the slowness of the
dynamics. A possible test of the applicability of the third law to glasses is discussed in the
test #1 review problems.




II. Probability 


II.A General Definitions


The laws of thermodynamics are based on observations of macroscopic bodies, and 
encapsulate their thermal properties. On the other hand, matter is composed of atoms 
and molecules whose motions are governed by fundamental laws of classical or quantum 
mechanics. It should be possible, in principle, to derive the behavior of a macroscopic 
body from the knowledge of its components. This is the problem addressed by kinetic 
theory in the following lectures. Actually, describing the full dynamics of the enormous 
number of particles involved is quite a daunting task. As we shall demonstrate, for dis- 
cussing equilibrium properties of a macroscopic system, full knowledge of the behavior of 
its constituent particles is not necessary. All that is required is the likelihood that the 
particles are in a particular microscopic state. Statistical mechanics is thus an inherently 
probabilistic description of the system. Familiarity with manipulations of probabilities is
therefore an important prerequisite to statistical mechanics. Our purpose here is to review 
some important results in the theory of probability, and to introduce the notations that 
will be used in the rest of the course. 

The entity under investigation is a random variable $x$, which has a set of possible
outcomes $S \equiv \{x_1, x_2, \cdots\}$. The outcomes may be discrete as in the case of a coin toss,
$S_{\rm coin} = \{\text{head}, \text{tail}\}$, or a dice throw, $S_{\rm dice} = \{1,2,3,4,5,6\}$, or continuous as for the
velocity of a particle in a gas, $S_{\vec v} = \{-\infty < v_x, v_y, v_z < \infty\}$, or the energy of an electron
in a metal at zero temperature, $S_\epsilon = \{0 \leq \epsilon \leq \epsilon_F\}$. An event is any subset of outcomes
$E \subset S$, and is assigned a probability $p(E)$, e.g. $p_{\rm dice}(\{1\}) = 1/6$, or $p_{\rm dice}(\{1,3\}) = 1/3$.
From an axiomatic point of view, the probabilities must satisfy the following conditions:

(i) Positivity: $p(E) \geq 0$, i.e. all probabilities must be non-negative.
(ii) Additivity: $p(A \text{ or } B) = p(A) + p(B)$, if $A$ and $B$ are disconnected (mutually exclusive) events.
(iii) Normalization: $p(S) = 1$, i.e. the random variable must have some outcome in $S$.

From a practical point of view, we would like to know how to assign probability values 
to various outcomes. There are two possible approaches: 

(1) Objective probabilities are obtained experimentally from the relative frequency of the
occurrence of an outcome in many tests of the random variable. If the random process
is repeated $N$ times, and the event $A$ occurs $N_A$ times, then
$$p(A) = \lim_{N\to\infty}\frac{N_A}{N}.$$
For example, a series of $N = 100, 200, 300$ throws of a dice may result in $N_1 =
19, 30, 48$ occurrences of 1. The ratios .19, .15, .16 provide an increasingly more
reliable estimate of the probability $p_{\rm dice}(\{1\})$.

(2) Subjective probabilities provide a theoretical estimate based on the uncertainties 
related to lack of precise knowledge of outcomes. For example, the assessment 
$p_{\rm dice}(\{1\}) = 1/6$, is based on the knowledge that there are six possible outcomes
to a dice throw, and that in the absence of any prior reason to believe that the dice is 
biased, all six are equally likely. All assignments of probability in Statistical Mechanics 
are subjectively based. The consequences of such subjective assignments of probability 
have to be checked against measurements, and they may need to be modified as more 


information about the outcomes becomes available. 
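The objective (frequency) estimate can be compared with the subjective assignment of 1/6 in a small simulation. This is only a sketch: the pseudo-random generator, the seed, and the numbers of throws are arbitrary choices made here, not part of the notes.

```python
import random

def estimate_probability(face, throws, rng):
    """Relative frequency N_A / N of a given face in a number of simulated throws."""
    hits = sum(1 for _ in range(throws) if rng.randint(1, 6) == face)
    return hits / throws

rng = random.Random(0)  # fixed seed, for reproducibility only
estimates = {n: estimate_probability(1, n, rng) for n in (100, 10_000, 200_000)}
for n, value in estimates.items():
    print(n, value)
```

The scatter of the estimate around 1/6 shrinks roughly as $1/\sqrt{N}$, which is why a few hundred throws only fix the first digit.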


II.B One Random Variable


As the properties of a discrete random variable are rather well known, here we focus 
on continuous random variables, which are more relevant to our purposes. Consider a 
random variable $x$, whose outcomes are real numbers, i.e. $S_x = \{-\infty < x < \infty\}$.
• The cumulative probability function (CPF) $P(x)$, is the probability of an outcome with
any value less than $x$, i.e. $P(x) = \text{prob.}(E \subset [-\infty, x])$. $P(x)$ must be a monotonically
increasing function of $x$, with $P(-\infty) = 0$ and $P(+\infty) = 1$.
• The probability density function (PDF) is defined by $p(x) \equiv dP(x)/dx$. Hence, $p(x)dx =
\text{prob.}(E \subset [x, x + dx])$. As a probability density, it is positive, and normalized such that
$$\text{prob.}(S) = \int_{-\infty}^{\infty} dx\, p(x) = 1. \tag{II.1}$$
Note that since $p(x)$ is a probability density, it has no upper bound, i.e. $0 \leq p(x) < \infty$.


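• The expectation value of any function $F(x)$, of the random variable is
$$\langle F(x)\rangle = \int_{-\infty}^{\infty} dx\, p(x) F(x). \tag{II.2}$$
The function $F(x)$ is itself a random variable, with an associated PDF of $p_F(f)df =
\text{prob.}(F(x) \in [f, f + df])$. There may be multiple solutions $x_i$, to the equation $F(x) = f$,
and
$$p_F(f)\,df = \sum_i p(x_i)\,dx_i, \quad\Longrightarrow\quad p_F(f) = \sum_i p(x_i)\left|\frac{dx}{dF}\right|_{x = x_i}. \tag{II.3}$$
The factors of $|dx/dF|$ are the Jacobians associated with the change of variables from $x$
to $F$. For example, consider $p(x) = \lambda\exp(-\lambda|x|)/2$, and the function $F(x) = x^2$. There
are two solutions to $F(x) = f$, located at $x_\pm = \pm\sqrt{f}$, with corresponding Jacobians
$|\pm f^{-1/2}/2|$. Hence,
$$p_F(f) = \frac{\lambda}{2}\exp\left(-\lambda\sqrt{f}\right)\left(\left|\frac{1}{2\sqrt{f}}\right| + \left|\frac{-1}{2\sqrt{f}}\right|\right) = \frac{\lambda\exp\left(-\lambda\sqrt{f}\right)}{2\sqrt{f}},$$
for $f > 0$, and $p_F(f) = 0$ for $f < 0$. Note that $p_F(f)$ has an (integrable) divergence at
$f = 0$.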
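The change of variables in this example can be tested by sampling. The sketch below draws $x$ from $p(x) = \lambda e^{-\lambda|x|}/2$, squares it, and compares the fraction of samples with $x^2 < f_0$ to the integral of $p_F$ from 0 to $f_0$, which is $1 - e^{-\lambda\sqrt{f_0}}$. The values of $\lambda$, $f_0$, the seed, and the sample size are arbitrary choices made here.

```python
import math
import random

def p_F(f, lam):
    """PDF of F = x**2 when p(x) = lam * exp(-lam * |x|) / 2."""
    if f <= 0:
        return 0.0
    root = math.sqrt(f)
    return lam * math.exp(-lam * root) / (2.0 * root)

def cdf_F(f, lam):
    """Integral of p_F from 0 to f, i.e. the probability that x**2 < f."""
    return 1.0 - math.exp(-lam * math.sqrt(f)) if f > 0 else 0.0

rng = random.Random(1)  # fixed seed, for reproducibility only
lam, f0, samples = 2.0, 0.25, 200_000
count = 0
for _ in range(samples):
    # |x| is exponentially distributed with rate lam; the sign is chosen at random
    x = rng.expovariate(lam) * rng.choice((-1.0, 1.0))
    if x * x < f0:
        count += 1
empirical = count / samples
analytic = cdf_F(f0, lam)
print(empirical, analytic)
```

Despite the divergence of $p_F(f)$ at $f = 0$, the cumulative probability is finite and matches the sampled fraction.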


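• Moments of the PDF are expectation values for powers of the random variable. The $n^{\rm th}$
moment is
$$m_n \equiv \langle x^n\rangle = \int dx\, p(x)\, x^n. \tag{II.4}$$
• The characteristic function is the generator of moments of the distribution. It is simply
the Fourier transform of the PDF, defined by
$$\tilde p(k) = \left\langle e^{-ikx}\right\rangle = \int dx\, p(x)\, e^{-ikx}. \tag{II.5}$$
The PDF can be recovered from the characteristic function through the inverse Fourier
transform
$$p(x) = \frac{1}{2\pi}\int dk\, \tilde p(k)\, e^{+ikx}. \tag{II.6}$$
Moments of the distribution are obtained by expanding $\tilde p(k)$ in powers of $k$,
$$\tilde p(k) = \left\langle \sum_{n=0}^{\infty}\frac{(-ik)^n}{n!}x^n\right\rangle = \sum_{n=0}^{\infty}\frac{(-ik)^n}{n!}\langle x^n\rangle. \tag{II.7}$$
Moments of the PDF around any point $x_0$ can also be generated by expanding
$$e^{ikx_0}\tilde p(k) = \left\langle e^{-ik(x - x_0)}\right\rangle = \sum_{n=0}^{\infty}\frac{(-ik)^n}{n!}\langle (x - x_0)^n\rangle. \tag{II.8}$$
• The cumulant generating function is the logarithm of the characteristic function. Its
expansion generates the cumulants of the distribution defined through
$$\ln\tilde p(k) = \sum_{n=1}^{\infty}\frac{(-ik)^n}{n!}\langle x^n\rangle_c. \tag{II.9}$$
Relations between moments and cumulants can be obtained by expanding the logarithm
of $\tilde p(k)$ in eq.(II.7), and using
$$\ln(1 + \epsilon) = \sum_{n=1}^{\infty}(-1)^{n+1}\frac{\epsilon^n}{n}. \tag{II.10}$$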


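The first four cumulants are called the mean, variance, skewness, and kurtosis of the
distribution respectively, and are obtained from the moments as
$$\begin{aligned}
\langle x\rangle_c &= \langle x\rangle, \\
\langle x^2\rangle_c &= \langle x^2\rangle - \langle x\rangle^2, \\
\langle x^3\rangle_c &= \langle x^3\rangle - 3\langle x^2\rangle\langle x\rangle + 2\langle x\rangle^3, \\
\langle x^4\rangle_c &= \langle x^4\rangle - 4\langle x^3\rangle\langle x\rangle - 3\langle x^2\rangle^2 + 12\langle x^2\rangle\langle x\rangle^2 - 6\langle x\rangle^4.
\end{aligned} \tag{II.11}$$
The cumulants provide a useful and compact way of describing a PDF.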
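The relations in eq.(II.11) translate directly into code. The sketch below evaluates them with exact rational arithmetic for a fair six-sided die; the choice of distribution and the function names are illustrative assumptions.

```python
from fractions import Fraction

def moments(pmf, order=4):
    """Return [<x>, <x^2>, ..., <x^order>] for a dict {outcome: probability}."""
    return [sum(p * x ** n for x, p in pmf.items()) for n in range(1, order + 1)]

def cumulants_from_moments(m1, m2, m3, m4):
    """First four cumulants in terms of the first four moments, as in eq.(II.11)."""
    c1 = m1
    c2 = m2 - m1 ** 2
    c3 = m3 - 3 * m2 * m1 + 2 * m1 ** 3
    c4 = m4 - 4 * m3 * m1 - 3 * m2 ** 2 + 12 * m2 * m1 ** 2 - 6 * m1 ** 4
    return c1, c2, c3, c4

die = {face: Fraction(1, 6) for face in range(1, 7)}  # fair die (illustrative)
mean, variance, third, fourth = cumulants_from_moments(*moments(die))
print(mean, variance, third, fourth)
```

The third cumulant vanishes because the die distribution is symmetric about its mean, while the fourth is negative, reflecting a distribution flatter than a Gaussian.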


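An important theorem allows easy computation of moments in terms of the cumulants:
Represent the $n^{\rm th}$ cumulant graphically as a connected cluster of $n$ points. The $m^{\rm th}$ moment
is then obtained by summing all possible subdivisions of $m$ points into groupings of smaller
(connected or disconnected) clusters. The contribution of each subdivision to the sum is
the product of the connected cumulants that it represents. Using this result the first four
moments are easily computed as
$$\begin{aligned}
\langle x\rangle &= \langle x\rangle_c, \\
\langle x^2\rangle &= \langle x^2\rangle_c + \langle x\rangle_c^2, \\
\langle x^3\rangle &= \langle x^3\rangle_c + 3\langle x^2\rangle_c\langle x\rangle_c + \langle x\rangle_c^3, \\
\langle x^4\rangle &= \langle x^4\rangle_c + 4\langle x^3\rangle_c\langle x\rangle_c + 3\langle x^2\rangle_c^2 + 6\langle x^2\rangle_c\langle x\rangle_c^2 + \langle x\rangle_c^4.
\end{aligned} \tag{II.12}$$
This theorem, which is the starting point for various diagrammatic computations in
statistical mechanics and field theory, is easily proved by equating the expressions in eqs.(II.7)
and (II.9) for $\tilde p(k)$
$$\sum_{m=0}^{\infty}\frac{(-ik)^m}{m!}\langle x^m\rangle = \exp\left[\sum_{n=1}^{\infty}\frac{(-ik)^n}{n!}\langle x^n\rangle_c\right] = \prod_n\sum_{p_n}\left[\frac{(-ik)^{np_n}}{p_n!}\left(\frac{\langle x^n\rangle_c}{n!}\right)^{p_n}\right]. \tag{II.13}$$
Equating the powers of $(-ik)^m$ on the two sides of the above expression leads to
$$\langle x^m\rangle = {\sum_{\{p_n\}}}'\; m!\prod_n\frac{1}{p_n!\,(n!)^{p_n}}\langle x^n\rangle_c^{p_n}. \tag{II.14}$$
The sum is restricted such that $\sum_n n p_n = m$, and leads to the graphical interpretation
given above, as the numerical factor is simply the number of ways of breaking $m$ points
into $\{p_n\}$ clusters of $n$ points.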
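The restricted sum in eq.(II.14) can be carried out by a short recursion over the occupation numbers $p_n$. The following is a sketch, with a hypothetical function name and with the cumulants passed as a list indexed by $n$; it reproduces the entries of eq.(II.12) for any numerical input.

```python
from fractions import Fraction
from math import factorial

def moment_from_cumulants(m, cumulants):
    """<x^m> from eq.(II.14); cumulants[n] is <x^n>_c for n = 1..m (index 0 unused)."""
    def recurse(n, remaining):
        # Sum over p_n, p_{n+1}, ... subject to sum_n n * p_n == remaining
        if remaining == 0:
            return Fraction(1)
        if n > remaining:
            return Fraction(0)
        total = Fraction(0)
        p = 0
        while n * p <= remaining:
            weight = Fraction(cumulants[n]) ** p / (factorial(p) * factorial(n) ** p)
            total += weight * recurse(n + 1, remaining - n * p)
            p += 1
        return total
    return factorial(m) * recurse(1, m)

# Example inputs: mean 2, variance 3, all higher cumulants zero
example = [0, 2, 3, 0, 0]
print([moment_from_cumulants(m, example) for m in range(1, 5)])
```

With only the first two cumulants non-zero, the fourth moment reduces to $3\langle x^2\rangle_c^2 + 6\langle x^2\rangle_c\langle x\rangle_c^2 + \langle x\rangle_c^4$, which is 115 for the inputs above.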




II.C Some Important Probability Distributions 


The properties of three commonly encountered probability distributions are examined 
in this section. 
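(1) The normal (Gaussian) distribution describes a continuous real random variable $x$,
with
$$p(x) = \frac{1}{\sqrt{2\pi\sigma^2}}\exp\left[-\frac{(x - \lambda)^2}{2\sigma^2}\right]. \tag{II.15}$$
The corresponding characteristic function also has a Gaussian form,
$$\tilde p(k) = \int_{-\infty}^{\infty} dx\,\frac{1}{\sqrt{2\pi\sigma^2}}\exp\left[-\frac{(x - \lambda)^2}{2\sigma^2} - ikx\right] = \exp\left[-ik\lambda - \frac{k^2\sigma^2}{2}\right]. \tag{II.16}$$
Cumulants of the distribution can be identified from $\ln\tilde p(k) = -ik\lambda - k^2\sigma^2/2$, using
eq.(II.9), as
$$\langle x\rangle_c = \lambda, \quad \langle x^2\rangle_c = \sigma^2, \quad \langle x^3\rangle_c = \langle x^4\rangle_c = \cdots = 0. \tag{II.17}$$
The normal distribution is thus completely specified by its first two cumulants. This makes
the computation of moments using the cluster expansion (eqs.(II.12)) particularly simple,
and
$$\begin{aligned}
\langle x\rangle &= \lambda, \\
\langle x^2\rangle &= \sigma^2 + \lambda^2, \\
\langle x^3\rangle &= 3\sigma^2\lambda + \lambda^3, \\
\langle x^4\rangle &= 3\sigma^4 + 6\sigma^2\lambda^2 + \lambda^4, \\
&\cdots
\end{aligned} \tag{II.18}$$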


The normal distribution serves as the starting point for most perturbative computations 
in field theory. The vanishing of higher cumulants implies that all graphical computations 
involve only products of one point, and two point (known as propagators) clusters. 

(2) The binomial distribution: Consider a random variable with two outcomes $A$ and $B$
(e.g. a coin toss) of relative probabilities $p_A$ and $p_B = 1 - p_A$. The probability that in $N$
trials the event $A$ occurs exactly $N_A$ times (e.g. 5 heads in 12 coin tosses), is given by the
binomial distribution
$$p_N(N_A) = \binom{N}{N_A} p_A^{N_A} p_B^{N - N_A}. \tag{II.19}$$
The prefactor,
$$\binom{N}{N_A} = \frac{N!}{N_A!\,(N - N_A)!}, \tag{II.20}$$
is just the coefficient obtained in the binomial expansion of $(p_A + p_B)^N$, and gives the
number of possible orderings of $N_A$ events $A$ and $N_B = N - N_A$ events $B$. The characteristic
function for this discrete distribution is given by
$$\tilde p_N(k) = \left\langle e^{-ikN_A}\right\rangle = \sum_{N_A = 0}^{N}\frac{N!}{N_A!\,(N - N_A)!}\,p_A^{N_A} p_B^{N - N_A} e^{-ikN_A} = \left(p_A e^{-ik} + p_B\right)^N. \tag{II.21}$$
The resulting cumulant generating function is
$$\ln\tilde p_N(k) = N\ln\left(p_A e^{-ik} + p_B\right) = N\ln\tilde p_1(k), \tag{II.22}$$
where $\ln\tilde p_1(k)$ is the cumulant generating function for a single trial. Hence, the cumulants
after $N$ trials are simply $N$ times the cumulants in a single trial. In each trial, the allowed
values of $N_A$ are 0 and 1 with respective probabilities $p_B$ and $p_A$, leading to $\langle N_A^\ell\rangle = p_A$,
for all $\ell$. After $N$ trials the first two cumulants are
$$\langle N_A\rangle_c = Np_A, \quad \langle N_A^2\rangle_c = N\left(p_A - p_A^2\right) = Np_Ap_B. \tag{II.23}$$


A measure of fluctuations around the mean is provided by the standard deviation, which is
the square root of the variance. While the mean of the binomial distribution scales as $N$,
its standard deviation only grows as $\sqrt{N}$. Hence, the relative uncertainty becomes smaller
for large $N$.
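The scaling of the mean and standard deviation can be seen by summing the binomial probabilities directly. This is a minimal sketch with an arbitrary $p_A = 0.3$ and a few arbitrary values of $N$.

```python
from math import comb, sqrt

def binomial_pmf(N, p):
    """List of probabilities p_N(N_A) for N_A = 0..N, as in eq.(II.19)."""
    return [comb(N, k) * p ** k * (1 - p) ** (N - k) for k in range(N + 1)]

def mean_and_variance(pmf):
    """First two cumulants of a distribution over the integers 0..len(pmf)-1."""
    mean = sum(k * w for k, w in enumerate(pmf))
    var = sum((k - mean) ** 2 * w for k, w in enumerate(pmf))
    return mean, var

p_A = 0.3  # illustrative value
summary = {}
for N in (10, 100, 400):
    mean, var = mean_and_variance(binomial_pmf(N, p_A))
    summary[N] = (mean, var, sqrt(var) / mean)
    print(N, mean, var, sqrt(var) / mean)
```

The last column, the relative uncertainty, falls as $1/\sqrt{N}$: going from $N = 100$ to $N = 400$ halves it.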

The binomial distribution is straightforwardly generalized to a multinomial distribution,
when the several outcomes $\{A, B, \cdots, M\}$ occur with probabilities $\{p_A, p_B, \cdots, p_M\}$.
The probability of finding outcomes $\{N_A, N_B, \cdots, N_M\}$ in a total of $N = N_A + N_B + \cdots + N_M$
trials is
$$p_N(\{N_A, N_B, \cdots, N_M\}) = \frac{N!}{N_A!\,N_B!\cdots N_M!}\,p_A^{N_A} p_B^{N_B}\cdots p_M^{N_M}. \tag{II.24}$$
(3) The Poisson distribution: The classical example of a Poisson process is radioactive
decay. Observing a piece of radioactive material over a time interval $T$ shows that:
(a) The probability of one and only one event (decay) in the interval $[t, t + dt]$ is
proportional to $dt$ as $dt \to 0$,
(b) The probabilities of events at different intervals are independent of each other.

The probability of observing exactly $M$ decays in the interval $T$ is given by the Poisson
distribution. It is obtained as a limit of the binomial distribution by subdividing the
interval into $N = T/dt \gg 1$ segments of size $dt$. In each segment, an event occurs with
probability $p = \alpha\,dt$, and there is no event with probability $q = 1 - \alpha\,dt$. As the probability
of more than one event in $dt$ is too small to consider, the process is equivalent to a binomial
one. Using eq.(II.21), the characteristic function is given by
$$\tilde p(k) = \left(pe^{-ik} + q\right)^N = \lim_{dt\to 0}\left[1 + \alpha\,dt\left(e^{-ik} - 1\right)\right]^{T/dt} = \exp\left[\alpha\left(e^{-ik} - 1\right)T\right]. \tag{II.25}$$


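The Poisson PDF is obtained from the inverse Fourier transform in eq.(II.6) as
$$p(x) = \int_{-\infty}^{\infty}\frac{dk}{2\pi}\exp\left[\alpha\left(e^{-ik} - 1\right)T + ikx\right] = e^{-\alpha T}\int_{-\infty}^{\infty}\frac{dk}{2\pi}\,e^{ikx}\sum_{M=0}^{\infty}\frac{(\alpha T)^M}{M!}e^{-ikM}, \tag{II.26}$$
using the power series for the exponential. The integral over $k$ is
$$\int_{-\infty}^{\infty}\frac{dk}{2\pi}\,e^{ik(x - M)} = \delta(x - M), \tag{II.27}$$
leading to
$$p_{\alpha T}(x) = \sum_{M=0}^{\infty}e^{-\alpha T}\frac{(\alpha T)^M}{M!}\,\delta(x - M). \tag{II.28}$$
The probability of $M$ events is thus $p_{\alpha T}(M) = e^{-\alpha T}(\alpha T)^M/M!$. The cumulants of the
distribution are obtained from the expansion
$$\ln\tilde p_{\alpha T}(k) = \alpha T\left(e^{-ik} - 1\right) = \alpha T\sum_{n=1}^{\infty}\frac{(-ik)^n}{n!}, \quad\Longrightarrow\quad \langle M^n\rangle_c = \alpha T. \tag{II.29}$$
All cumulants have the same value, and the moments are obtained from eqs.(II.12) as
$$\langle M\rangle = (\alpha T), \quad \langle M^2\rangle = (\alpha T)^2 + (\alpha T), \quad \langle M^3\rangle = (\alpha T)^3 + 3(\alpha T)^2 + (\alpha T). \tag{II.30}$$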
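The limit that takes the binomial distribution to the Poisson distribution can be watched numerically. The sketch below holds $\alpha T = 2$ fixed (an arbitrary value), splits the interval into $N$ segments with $p = \alpha T/N$, and records the largest difference between the two probability functions over $M = 0, \ldots, 10$.

```python
from math import comb, exp, factorial

def binomial_pmf(M, N, p):
    """Probability of exactly M events in N segments, each with probability p."""
    return comb(N, M) * p ** M * (1 - p) ** (N - M)

def poisson_pmf(M, mean):
    """Probability of M events for a Poisson distribution with mean alpha * T."""
    return exp(-mean) * mean ** M / factorial(M)

alpha_T = 2.0  # illustrative value of alpha * T
max_diff = {}
for N in (10, 100, 10_000):
    p = alpha_T / N
    max_diff[N] = max(
        abs(binomial_pmf(M, N, p) - poisson_pmf(M, alpha_T)) for M in range(11)
    )
    print(N, max_diff[N])
```

The discrepancy shrinks roughly in proportion to $1/N$, i.e. in proportion to the segment size $dt$.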


Example: Assuming that stars are randomly distributed in the galaxy (clearly unjustified) 
with a density n, what is the probability that the nearest star is at a distance R? 

Since the probability of finding a star in a small volume $dV$ is $n\,dV$, and they are
assumed to be independent, the number of stars in a volume $V$ is described by a Poisson
process as in eq.(II.28), with $\alpha = n$. The probability $p(R)$, of encountering the first star
at a distance $R$ is the product of the probabilities $p_{nV}(0)$, of finding zero stars in the
volume $V = 4\pi R^3/3$ around the origin, and $p_{n\,dV}(1)$, of finding one star in the shell of
volume $dV = 4\pi R^2 dR$ at a distance $R$. Both $p_{nV}(0)$ and $p_{n\,dV}(1)$ can be calculated from
eq.(II.28), and
$$p(R)\,dR = p_{nV}(0)\,p_{n\,dV}(1) = e^{-4\pi R^3 n/3}\,e^{-4\pi R^2 n\,dR}\,4\pi R^2 n\,dR, \quad\Longrightarrow\quad p(R) = 4\pi R^2 n\exp\left(-\frac{4\pi}{3}R^3 n\right). \tag{II.31}$$
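As a consistency check on eq.(II.31), the sketch below integrates $p(R)$ numerically to confirm that it is normalized, and compares the mean nearest-star distance with $\Gamma(4/3)\,(3/4\pi n)^{1/3}$. That closed form is not quoted in the notes; it is obtained here by substituting $u = 4\pi nR^3/3$ in the integral. The density $n = 1$ and the integration cutoff are arbitrary choices.

```python
import math

def nearest_star_pdf(R, n):
    """p(R) of eq.(II.31) for stars of density n."""
    return 4.0 * math.pi * R ** 2 * n * math.exp(-4.0 * math.pi * R ** 3 * n / 3.0)

def integrate(f, a, b, steps=200_000):
    """Midpoint-rule integral of f over [a, b]."""
    h = (b - a) / steps
    return sum(f(a + (i + 0.5) * h) for i in range(steps)) * h

n = 1.0  # illustrative density
norm = integrate(lambda R: nearest_star_pdf(R, n), 0.0, 5.0)
mean_R = integrate(lambda R: R * nearest_star_pdf(R, n), 0.0, 5.0)
mean_R_exact = math.gamma(4.0 / 3.0) * (3.0 / (4.0 * math.pi * n)) ** (1.0 / 3.0)
print(norm, mean_R, mean_R_exact)
```

The typical separation scales as $n^{-1/3}$, as expected on dimensional grounds.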




II.D Many Random Variables 


With more than one random variable, the set of outcomes is an $N$-dimensional space,
$S_{\mathbf{x}} = \{-\infty < x_1, x_2, \cdots, x_N < \infty\}$. For example, describing the location and velocity of a
gas particle requires six coordinates.
• The joint PDF $p(\mathbf{x})$, is the probability density of an outcome in a volume element
$d^N\mathbf{x} = \prod_{i=1}^{N}dx_i$ around the point $\mathbf{x} = \{x_1, x_2, \cdots, x_N\}$. The joint PDF is normalized
such that
$$p_{\mathbf{x}}(S) = \int d^N\mathbf{x}\; p(\mathbf{x}) = 1. \tag{II.32}$$
If, and only if, the $N$ random variables are independent, the joint PDF is the product of
individual PDFs,
$$p(\mathbf{x}) = \prod_{i=1}^{N}p_i(x_i). \tag{II.33}$$


• The unconditional PDF describes the behavior of a subset of random variables, independent
of the values of the others. For example, if we are interested only in the location of
a gas particle, an unconditional PDF can be constructed by integrating over all velocities
at a given location, $p(\vec x) = \int d^3\vec v\; p(\vec x, \vec v)$; more generally
$$p(x_1, \cdots, x_m) = \int\prod_{i=m+1}^{N}dx_i\; p(x_1, \cdots, x_N). \tag{II.34}$$
• The conditional PDF describes the behavior of a subset of random variables, for specified
values of the others. For example, the PDF for the velocity of a particle at a particular
location $\vec x$, denoted by $p(\vec v\,|\,\vec x)$, is proportional to the joint PDF, $p(\vec v\,|\,\vec x) = p(\vec x, \vec v)/\mathcal{N}$.
The constant of proportionality, obtained by normalizing $p(\vec v\,|\,\vec x)$, is
$$\mathcal{N} = \int d^3\vec v\; p(\vec x, \vec v) = p(\vec x), \tag{II.35}$$
the unconditional PDF for a particle at $\vec x$. In general, the conditional PDFs are obtained
from Bayes' Theorem as
$$p(x_1, \cdots, x_m\,|\,x_{m+1}, \cdots, x_N) = \frac{p(x_1, \cdots, x_N)}{p(x_{m+1}, \cdots, x_N)}. \tag{II.36}$$
Note that if the random variables are independent, the unconditional PDF is equal to the
conditional PDF.




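• The expectation value of a function $F(\mathbf{x})$, is obtained as before from
$$\langle F(\mathbf{x})\rangle = \int d^N\mathbf{x}\; p(\mathbf{x})F(\mathbf{x}). \tag{II.37}$$
• The joint characteristic function, is obtained from the $N$-dimensional Fourier
transformation of the joint PDF,
$$\tilde p(\mathbf{k}) = \left\langle\exp\left(-i\sum_{j=1}^{N}k_jx_j\right)\right\rangle. \tag{II.38}$$
The joint moments and joint cumulants are generated by $\tilde p(\mathbf{k})$ and $\ln\tilde p(\mathbf{k})$ respectively, as
$$\begin{aligned}
\left\langle x_1^{n_1}x_2^{n_2}\cdots x_N^{n_N}\right\rangle &= \left[\frac{\partial}{\partial(-ik_1)}\right]^{n_1}\left[\frac{\partial}{\partial(-ik_2)}\right]^{n_2}\cdots\left[\frac{\partial}{\partial(-ik_N)}\right]^{n_N}\tilde p(\mathbf{k} = 0), \\
\left\langle x_1^{n_1}x_2^{n_2}\cdots x_N^{n_N}\right\rangle_c &= \left[\frac{\partial}{\partial(-ik_1)}\right]^{n_1}\left[\frac{\partial}{\partial(-ik_2)}\right]^{n_2}\cdots\left[\frac{\partial}{\partial(-ik_N)}\right]^{n_N}\ln\tilde p(\mathbf{k} = 0).
\end{aligned} \tag{II.39}$$
The previously described graphical relation between joint moments (all clusters of labelled
points), and joint cumulants (connected clusters) is still applicable. For example
$$\begin{aligned}
\langle x_\alpha x_\beta\rangle &= \langle x_\alpha\rangle_c\langle x_\beta\rangle_c + \langle x_\alpha x_\beta\rangle_c, \quad\text{and} \\
\langle x_\alpha^2x_\beta\rangle &= \langle x_\alpha\rangle_c^2\langle x_\beta\rangle_c + \langle x_\alpha^2\rangle_c\langle x_\beta\rangle_c + 2\langle x_\alpha x_\beta\rangle_c\langle x_\alpha\rangle_c + \langle x_\alpha^2x_\beta\rangle_c.
\end{aligned} \tag{II.40}$$
The connected correlation, $\langle x_\alpha x_\beta\rangle_c$, is zero if $x_\alpha$ and $x_\beta$ are independent random variables.
• The joint Gaussian distribution is the generalization of eq.(II.15) to $N$ random variables,
as
$$p(\mathbf{x}) = \frac{1}{\sqrt{(2\pi)^N\det[C]}}\exp\left[-\frac{1}{2}\sum_{mn}\left(C^{-1}\right)_{mn}(x_m - \lambda_m)(x_n - \lambda_n)\right], \tag{II.41}$$
where $C$ is a symmetric matrix, and $C^{-1}$ is its inverse. The simplest way to get the
normalization factor is to make a linear transformation from the variables $y_j = x_j - \lambda_j$,
using the unitary matrix that diagonalizes $C$. This reduces the normalization to that of
the product of $N$ Gaussians whose variances are determined by the eigenvalues of $C$. The
product of the eigenvalues is the determinant $\det[C]$. (This also indicates that the matrix
$C$ must be positive definite.) The corresponding joint characteristic function is obtained
by similar manipulations, and is given by
$$\tilde p(\mathbf{k}) = \exp\left[-ik_m\lambda_m - \frac{1}{2}C_{mn}k_mk_n\right], \tag{II.42}$$
where the summation convention is used.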


The joint cumulants of the Gaussian are then obtained from $\ln\tilde p(\mathbf{k})$ as
$$\langle x_m\rangle_c = \lambda_m, \quad \langle x_mx_n\rangle_c = C_{mn}, \tag{II.43}$$
with all higher cumulants equal to zero. In the special case of $\{\lambda_m\} = 0$, all odd moments
of the distribution are zero, while the general rules for relating moments to cumulants
indicate that any even moment is obtained by summing over all ways of grouping the
involved random variables into pairs, e.g.
$$\langle x_ax_bx_cx_d\rangle = C_{ab}C_{cd} + C_{ac}C_{bd} + C_{ad}C_{bc}. \tag{II.44}$$
This result is sometimes referred to as Wick's theorem.
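The sum over pairings in eq.(II.44) is easy to enumerate explicitly. The following sketch generates all groupings of a list of indices into pairs and multiplies the corresponding covariance entries; the $2\times 2$ matrix $C$ is an arbitrary positive definite example chosen here.

```python
def pairings(items):
    """Yield every way of grouping an even-length list into unordered pairs."""
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for i, partner in enumerate(rest):
        remaining = rest[:i] + rest[i + 1:]
        for tail in pairings(remaining):
            yield [(first, partner)] + tail

def gaussian_moment(indices, C):
    """<x_a x_b ...> for a zero-mean joint Gaussian with covariance matrix C."""
    if len(indices) % 2:
        return 0.0  # odd moments vanish when all lambda_m = 0
    total = 0.0
    for pairing in pairings(list(indices)):
        term = 1.0
        for a, b in pairing:
            term *= C[a][b]
        total += term
    return total

C = [[2.0, 0.5], [0.5, 1.0]]  # illustrative covariance matrix
print(gaussian_moment((0, 0, 0, 0), C))  # 3 * C00**2
print(gaussian_moment((0, 0, 1, 1), C))  # C00*C11 + 2*C01**2
```

The number of pairings of $2\ell$ points is $(2\ell - 1)!!$, so there are 3 terms for a four-point moment and 15 for a six-point one.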


II.E Sums of Random Variables & the Central Limit Theorem


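Consider the sum $X = \sum_{i=1}^{N}x_i$, where $x_i$ are random variables with a joint PDF of
$p(\mathbf{x})$. The PDF for $X$ is
$$p_X(x) = \int d^N\mathbf{x}\; p(\mathbf{x})\,\delta\Big(x - \sum x_i\Big) = \int\prod_{i=1}^{N-1}dx_i\; p(x_1, \cdots, x_{N-1}, x - x_1\cdots - x_{N-1}), \tag{II.45}$$
and the corresponding characteristic function (using eq.(II.38)) is given by
$$\tilde p_X(k) = \left\langle\exp\left(-ik\sum_{j=1}^{N}x_j\right)\right\rangle = \tilde p(k_1 = k_2 = \cdots = k_N = k). \tag{II.46}$$
Cumulants of the sum are obtained by expanding $\ln\tilde p_X(k)$,
$$\ln\tilde p(k_1 = k_2 = \cdots = k_N = k) = -ik\sum_{i_1=1}^{N}\langle x_{i_1}\rangle_c + \frac{(-ik)^2}{2}\sum_{i_1,i_2}^{N}\langle x_{i_1}x_{i_2}\rangle_c + \cdots, \tag{II.47}$$
as
$$\langle X\rangle_c = \sum_{i=1}^{N}\langle x_i\rangle_c, \quad \langle X^2\rangle_c = \sum_{i,j}^{N}\langle x_ix_j\rangle_c, \quad\cdots. \tag{II.48}$$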


If the random variables are independent, $p(\mathbf{x}) = \prod p_i(x_i)$, and $\tilde p_X(k) = \prod\tilde p_i(k)$.
The cross-cumulants in eq.(II.48) vanish, and the $n^{\rm th}$ cumulant of $X$ is simply the sum
of the individual cumulants, $\langle X^n\rangle_c = \sum_{i=1}^{N}\langle x_i^n\rangle_c$. When all the $N$ random variables
are independently taken from the same distribution $p(x)$, this implies $\langle X^n\rangle_c = N\langle x^n\rangle_c$,
generalizing the result obtained previously for the binomial distribution. For large values
of $N$, the average value of the sum is proportional to $N$, while fluctuations around the
mean, as measured by the standard deviation, grow only as $\sqrt{N}$. The random variable
$y = (X - N\langle x\rangle_c)/\sqrt{N}$, has zero mean, and cumulants that scale as $\langle y^n\rangle_c \propto N^{1 - n/2}$. As
$N \to \infty$, only the second cumulant survives, and the PDF for $y$ converges to the normal
distribution,
$$\lim_{N\to\infty}p\left(y = \frac{\sum_{i=1}^{N}x_i - N\langle x\rangle_c}{\sqrt{N}}\right) = \frac{1}{\sqrt{2\pi\langle x^2\rangle_c}}\exp\left(-\frac{y^2}{2\langle x^2\rangle_c}\right). \tag{II.49}$$
(Note that the Gaussian distribution is the only distribution with only first and second
cumulants.)

The convergence of the PDF for the sum of many random variables to a normal
distribution is a most important result in the context of statistical mechanics where such
sums are frequently encountered. The central limit theorem states a more general form of
this result: It is not necessary for the random variables to be independent, as the condition
$\sum_{i_1,\cdots,i_m}^{N}\langle x_{i_1}\cdots x_{i_m}\rangle_c \ll O(N^{m/2})$, is sufficient for the validity of eq.(II.49).
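The scaling $\langle y^n\rangle_c \propto N^{1-n/2}$ for independent, identically distributed variables can be checked exactly for a small discrete distribution. The sketch below builds the distribution of the sum by repeated convolution and extracts the second, third, and fourth cumulants of $y$; the three-outcome distribution is an arbitrary skewed example chosen here.

```python
def convolve(p, q):
    """Distribution of the sum of two independent integer-valued variables."""
    out = [0.0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

def cumulants_2_3_4(pmf):
    """Second, third and fourth cumulants from the central moments."""
    mean = sum(k * w for k, w in enumerate(pmf))
    mu2, mu3, mu4 = (
        sum((k - mean) ** n * w for k, w in enumerate(pmf)) for n in (2, 3, 4)
    )
    return mu2, mu3, mu4 - 3.0 * mu2 ** 2

single = [0.6, 0.3, 0.1]  # probabilities of outcomes 0, 1, 2 (illustrative)
k2, k3, k4 = cumulants_2_3_4(single)

results = {}
pmf_sum = [1.0]
for N in range(1, 65):
    pmf_sum = convolve(pmf_sum, single)
    if N in (1, 4, 16, 64):
        K2, K3, K4 = cumulants_2_3_4(pmf_sum)
        # cumulants of y = (X - N<x>)/sqrt(N) are those of X divided by N**(n/2)
        results[N] = (K2 / N, K3 / N ** 1.5, K4 / N ** 2)
        print(N, results[N])
```

The variance of $y$ stays fixed while its third and fourth cumulants decay as $N^{-1/2}$ and $N^{-1}$, which is the content of the convergence to eq.(II.49).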


II.F Rules for Large Numbers 


To describe equilibrium properties of macroscopic bodies, statistical mechanics has to 
deal with the very large number N, of microscopic degrees of freedom. Actually, taking 
the thermodynamic limit of N — oo leads to a number of simplifications, some of which 
are described in this section. 

There are typically three types of N dependence encountered in the thermodynamic 
limit: 

(a) Intensive quantities, such as temperature $T$, and generalized forces, e.g. pressure $P$,
and magnetic field $\vec B$, are independent of $N$, i.e. $O(N^0)$.
(b) Extensive quantities, such as energy $E$, entropy $S$, and generalized displacements, e.g.
volume $V$, and magnetization $\vec M$, are proportional to $N$, i.e. $O(N^1)$.
(c) Exponential dependence, i.e. $O(\exp(N\phi))$, is encountered in enumerating discrete
micro-states, or computing available volumes in phase space.

Other asymptotic dependencies are certainly not ruled out a priori. For example, the
Coulomb energy of $N$ ions at fixed density scales as $Q^2/R \sim N^{5/3}$. Such dependencies
are rarely encountered in everyday physics. The Coulomb interaction of ions is quickly
screened by counter-ions, resulting in an extensive overall energy. (This is not the case in
astrophysical problems since the gravitational energy cannot be screened. For example
the entropy of a black hole is proportional to the square of its mass.)

In statistical mechanics we frequently encounter sums or integrals of exponential vari- 
ables. Performing such sums in the thermodynamic limit is considerably simplified due to 
the following results. 

(1) Summation of Exponential Quantities 


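Consider the sum

$$ \mathcal{S} = \sum_{i=1}^{\mathcal{N}} \mathcal{E}_i , \tag{II.50} $$

where each term is positive, with an exponential dependence on $N$,

$$ 0 \le \mathcal{E}_i \sim O\left(\exp(N\phi_i)\right), \tag{II.51} $$

and the number of terms $\mathcal{N}$ is proportional to some power of $N$. Such a sum can be
approximated by its largest term $\mathcal{E}_{\max}$, in the following sense. Since for each term in the
sum, $0 \le \mathcal{E}_i \le \mathcal{E}_{\max}$,

$$ \mathcal{E}_{\max} \le \mathcal{S} \le \mathcal{N}\,\mathcal{E}_{\max} . \tag{II.52} $$

An intensive quantity can be constructed from $\ln\mathcal{S}/N$, which is bounded by

$$ \frac{\ln\mathcal{E}_{\max}}{N} \le \frac{\ln\mathcal{S}}{N} \le \frac{\ln\mathcal{E}_{\max}}{N} + \frac{\ln\mathcal{N}}{N} . \tag{II.53} $$

For $\mathcal{N} \propto N^p$, the ratio $\ln\mathcal{N}/N$ vanishes in the large $N$ limit, and

$$ \lim_{N\to\infty} \frac{\ln\mathcal{S}}{N} = \lim_{N\to\infty}\frac{\ln\mathcal{E}_{\max}}{N} = \phi_{\max} . \tag{II.54} $$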
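
The bounds of eq.(II.53) are easy to watch numerically. The following is a minimal sketch, assuming a hypothetical exponent profile $\phi(u) = 0.5 - (u - 0.3)^2$ sampled at $\mathcal{N} = N^2$ points; neither the profile nor the function names come from the text. The sum is evaluated in logarithmic form so that the exponentially large terms never overflow.

```python
import math

def phi(u):
    """Hypothetical exponent profile with maximum phi_max = 0.5 at u = 0.3."""
    return 0.5 - (u - 0.3) ** 2

def log_sum_over_n(n, power=2):
    """Return (ln S / N, ln E_max / N, ln(number of terms) / N) for E_i = exp(N phi_i)."""
    n_terms = n ** power  # number of terms grows as a power of N
    log_terms = [n * phi(i / n_terms) for i in range(1, n_terms + 1)]
    log_max = max(log_terms)
    # Factor out the largest term so that no exponential overflows.
    log_s = log_max + math.log(sum(math.exp(t - log_max) for t in log_terms))
    return log_s / n, log_max / n, math.log(n_terms) / n

for n in (10, 100, 400):
    per_n, lower, slack = log_sum_over_n(n)
    # Columns: N, lower bound, ln S / N, upper bound of eq.(II.53)
    print(n, round(lower, 4), round(per_n, 4), round(lower + slack, 4))
```

As $N$ increases the middle column is squeezed towards $\phi_{\max}$, even though the number of terms in the sum keeps growing.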
(2) Saddle Point Integration

Similarly, an integral of the form

$$ \mathcal{I} = \int dx\, \exp\left(N\phi(x)\right) \tag{II.55} $$

can be approximated by the maximum value of the integrand, obtained at a point $x_{\max}$
which maximizes the exponent $\phi(x)$. Expanding around this point,

$$ \mathcal{I} = \int dx\, \exp\left\{ N\left[\phi(x_{\max}) - \frac{1}{2}|\phi''(x_{\max})|(x-x_{\max})^2 + \cdots\right]\right\} . \tag{II.56} $$

Note that at the maximum, the first derivative $\phi'(x_{\max})$ is zero, while the second derivative
$\phi''(x_{\max})$ is negative. Terminating the series at the quadratic order results in

$$ \mathcal{I} \approx e^{N\phi(x_{\max})} \int dx\, \exp\left[-\frac{N}{2}|\phi''(x_{\max})|(x-x_{\max})^2\right] \approx \sqrt{\frac{2\pi}{N|\phi''(x_{\max})|}}\; e^{N\phi(x_{\max})} , \tag{II.57} $$

where the range of integration has been extended to $[-\infty, \infty]$. The latter is justified since
the integrand is negligibly small outside the neighborhood of $x_{\max}$.

There are two types of correction to the above result. Firstly, there are higher order
terms in the expansion of $\phi(x)$ around $x_{\max}$. These corrections can be looked at
perturbatively, and lead to a series in powers of $1/N$. Secondly, there may be additional
local maxima for the function. A maximum at $x'_{\max}$ leads to a similar Gaussian
integral that can be added to eq.(II.57). Clearly such contributions are smaller by
$O\left(\exp\{-N[\phi(x_{\max}) - \phi(x'_{\max})]\}\right)$. Since all these corrections vanish in the thermodynamic
limit,

$$ \lim_{N\to\infty}\frac{\ln\mathcal{I}}{N} = \lim_{N\to\infty}\left[\phi(x_{\max}) - \frac{1}{2N}\ln\left(\frac{N|\phi''(x_{\max})|}{2\pi}\right) + O\left(\frac{1}{N^2}\right)\right] = \phi(x_{\max}) . \tag{II.58} $$


The saddle point method for evaluating integrals is the extension of the above result to 
more general integrands, and integration paths in the complex plane. (The appropriate 
extremum in the complex plane is a saddle point.) The simplified version presented above 
is sufficient for the purposes of this course. 
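
A quick numerical comparison shows how fast the Gaussian estimate of eq.(II.57) becomes accurate. This is a sketch under assumptions: the exponent $\phi(x) = -x^2/2 - x^4/4$ is a hypothetical choice with a single maximum at $x = 0$, and the integral is evaluated with a plain trapezoid rule on a finite interval.

```python
import math

def phi(x):
    """Hypothetical exponent with a single maximum at x = 0, where phi'' = -1."""
    return -0.5 * x ** 2 - 0.25 * x ** 4

def integral_numeric(n, half_width=3.0, steps=60001):
    """Trapezoid-rule estimate of I = integral of exp(N * phi(x)) dx."""
    h = 2.0 * half_width / (steps - 1)
    total = 0.0
    for k in range(steps):
        x = -half_width + k * h
        weight = 0.5 if k in (0, steps - 1) else 1.0
        total += weight * math.exp(n * phi(x))
    return total * h

def integral_saddle(n, phi_max=0.0, phi_second=-1.0):
    """Gaussian (saddle point) estimate, as in eq.(II.57)."""
    return math.sqrt(2.0 * math.pi / (n * abs(phi_second))) * math.exp(n * phi_max)

for n in (1, 10, 100, 1000):
    ratio = integral_numeric(n) / integral_saddle(n)
    print(n, round(ratio, 5))  # the ratio approaches 1 as N grows
```

For this particular $\phi$ the leading correction to the ratio is of order $1/N$, in line with the perturbative series mentioned above.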

• Stirling’s approximation for $N!$ at large $N$ can be obtained by saddle point integration.
In order to get an integral representation of $N!$, start with the result

$$ \int_0^\infty dx\, e^{-\alpha x} = \frac{1}{\alpha} . \tag{II.59} $$

Repeated differentiation of both sides of the above equation with respect to $\alpha$ leads to

$$ \int_0^\infty dx\, x^N e^{-\alpha x} = \frac{N!}{\alpha^{N+1}} . \tag{II.60} $$

Although the above result only applies to integer $N$, it is possible to define by analytical
continuation a function,

$$ \Gamma(N+1) \equiv N! = \int_0^\infty dx\, x^N e^{-x} , \tag{II.61} $$

for all $N$. While the integral in eq.(II.61) is not exactly in the form of eq.(II.55), it can
still be evaluated by a similar method. The integrand can be written as $\exp(N\phi(x))$, with
$\phi(x) = \ln x - x/N$. The exponent has a maximum at $x_{\max} = N$, with $\phi(x_{\max}) = \ln N - 1$,
and $\phi''(x_{\max}) = -1/N^2$. Expanding the integrand in eq.(II.61) around this point yields,

$$ N! \approx \int dx\, \exp\left[N\ln N - N - \frac{1}{2N}(x-N)^2\right] = N^N e^{-N}\sqrt{2\pi N} , \tag{II.62} $$

where the integral is evaluated by extending its limits to $[-\infty, \infty]$. Stirling’s formula is
obtained by taking the logarithm of eq.(II.62) as,

$$ \ln N! = N\ln N - N + \frac{1}{2}\ln(2\pi N) + O\left(\frac{1}{N}\right) . \tag{II.63} $$
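
The accuracy of eq.(II.63) can be checked against the exact value of $\ln N!$. A minimal sketch, using the standard library function `math.lgamma` for $\ln\Gamma(N+1)$; the helper name `stirling_log_factorial` is introduced here for illustration only.

```python
import math

def stirling_log_factorial(n):
    """ln N! from eq.(II.63), dropping the O(1/N) remainder."""
    return n * math.log(n) - n + 0.5 * math.log(2.0 * math.pi * n)

for n in (1, 10, 100, 1000):
    exact = math.lgamma(n + 1)  # ln Gamma(N+1) = ln N!
    approx = stirling_log_factorial(n)
    # Columns: N, exact ln N!, Stirling estimate, remainder
    print(n, round(exact, 6), round(approx, 6), round(exact - approx, 6))
```

The remainder in the last column shrinks roughly as $1/(12N)$, consistent with the $O(1/N)$ term in eq.(II.63).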


II.G Information, Entropy, and Estimation


• Information: Consider a random variable with a discrete set of outcomes $\mathcal{S} = \{x_i\}$,
occurring with probabilities $\{p(i)\}$, for $i = 1,\cdots,M$. In the context of information theory,
there is a precise meaning to the information content of a probability distribution: Let us
construct a message from $N$ independent outcomes of the random variable. Since there are
$M$ possibilities for each character in this message, it has an apparent information content of
$N\ln_2 M$ bits; i.e. this many binary bits of information have to be transmitted to convey the
message precisely. On the other hand, the probabilities $\{p(i)\}$ limit the types of messages
that are likely. For example, if $p_2 \gg p_1$, it is very unlikely to construct a message with
more $x_1$ than $x_2$. In particular, in the limit of large $N$, we expect the message to contain
“roughly” $\{N_i = Np_i\}$ occurrences of each symbol.¹ The number of typical messages thus
corresponds to the number of ways of rearranging the $\{N_i\}$ occurrences of $\{x_i\}$, and is
given by the multinomial coefficient

$$ g = \frac{N!}{\prod_{i=1}^{M} N_i!} . \tag{II.64} $$

This is much smaller than the total number of messages $M^N$. To specify one out of $g$
possible sequences requires

$$ \ln_2 g \approx -N\sum_{i=1}^{M} p_i \ln_2 p_i \qquad (\text{for } N \to \infty), \tag{II.65} $$

bits of information. The last result is obtained by applying Stirling’s approximation for
$\ln N!$. It can also be obtained by noting that

$$ 1 = \left(\sum_i p_i\right)^N = \sum_{\{N_i\}} N! \prod_{i=1}^{M} \frac{p_i^{N_i}}{N_i!} \approx g \prod_{i=1}^{M} p_i^{Np_i} , \tag{II.66} $$

where the sum has been replaced by its largest term, as justified in the previous section.
Shannon’s Theorem proves more rigorously that the minimum number of bits necessary
to ensure that the percentage of errors in $N$ trials vanishes in the $N \to \infty$ limit, is $\ln_2 g$.
For any non-uniform distribution, this is less than the $N\ln_2 M$ bits needed in the absence
of any information on relative probabilities. The difference per trial is thus attributed to
the information content of the probability distribution, and is given by

$$ I[\{p_i\}] = \ln_2 M + \sum_{i=1}^{M} p_i \ln_2 p_i . \tag{II.67} $$

¹ More precisely, the probability of finding any $N_i$ that is different from $Np_i$ by more
than $\pm\sqrt{N}$ becomes exponentially small in $N$, as $N \to \infty$.
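
The approach of $\ln_2 g / N$ to $-\sum_i p_i \ln_2 p_i$ can be seen directly by evaluating the multinomial coefficient for increasing $N$. The sketch below assumes a hypothetical four-symbol source with probabilities $(1/2, 1/4, 1/8, 1/8)$; the function names are illustrative, and `math.lgamma` is used so that the factorials never have to be formed explicitly.

```python
import math

def log2_multinomial(counts):
    """log2 of N! / prod_i N_i!, using lgamma to avoid huge integers."""
    n = sum(counts)
    ln_g = math.lgamma(n + 1) - sum(math.lgamma(c + 1) for c in counts)
    return ln_g / math.log(2.0)

def bits_per_symbol(probs):
    """-sum_i p_i log2 p_i, the per-trial cost in eq.(II.65)."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

def information_content(probs):
    """I[{p_i}] = log2 M + sum_i p_i log2 p_i, as in eq.(II.67)."""
    return math.log2(len(probs)) - bits_per_symbol(probs)

probs = [0.5, 0.25, 0.125, 0.125]  # hypothetical four-symbol source
for n in (8, 80, 8000):
    counts = [round(n * p) for p in probs]  # the typical occupation numbers N_i = N p_i
    print(n, round(log2_multinomial(counts) / n, 4), bits_per_symbol(probs))
print(information_content(probs))
```

For this source a typical message costs 1.75 bits per symbol rather than the 2 bits needed for four equally likely symbols, so the distribution carries 0.25 bits of information per trial.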
• Entropy: Eq.(II.64) is encountered frequently in statistical mechanics in the context of
mixing $M$ distinct components; its natural logarithm is related to the entropy of mixing.
More generally, we can define an entropy for any probability distribution as

$$ S = -\sum_{i=1}^{M} p(i)\ln p(i) = -\langle \ln p(i)\rangle . \tag{II.68} $$

The above entropy takes a minimum value of zero for the delta-function distribution
$p(i) = \delta_{i,j}$, and a maximum value of $\ln M$ for the uniform distribution, $p(i) = 1/M$. $S$
is thus a measure of dispersity (disorder) of the distribution, and does not depend on the
values of the random variables $\{x_i\}$. A one to one mapping to $f_i = F(x_i)$ leaves the
entropy unchanged, while a many to one mapping makes the distribution more ordered
and decreases $S$. For example, if the two values, $x_1$ and $x_2$, are mapped onto the same $f$,
the change in entropy is

$$ \Delta S(x_1,x_2 \to f) = \left[p_1\ln\frac{p_1}{p_1+p_2} + p_2\ln\frac{p_2}{p_1+p_2}\right] < 0 . \tag{II.69} $$


• Estimation: The entropy $S$ can also be used to quantify subjective estimates of probabilities.
In the absence of any information, the best unbiased estimate is that all $M$ outcomes
are equally likely. This is the distribution of maximum entropy. If additional information
is available, the unbiased estimate is obtained by maximizing the entropy subject to the
constraints imposed by this information. For example, if it is known that $\langle F(x)\rangle = f$, we
can maximize

$$ S(\alpha,\beta,\{p_i\}) = -\sum_i p(i)\ln p(i) - \alpha\left(\sum_i p(i) - 1\right) - \beta\left(\sum_i p(i)F(x_i) - f\right) , \tag{II.70} $$

where the Lagrange multipliers $\alpha$ and $\beta$ are introduced to impose the constraints of normalization,
and $\langle F(x)\rangle = f$, respectively. The result of the optimization is a distribution
$p_i \propto \exp\left(-\beta F(x_i)\right)$, where the value of $\beta$ is fixed by the constraint. This process can
be generalized to an arbitrary number of conditions. It is easy to see that if the first $n$
moments (and hence $n$ cumulants) of a distribution are specified, the unbiased estimate is
the exponential of an $n$-th order polynomial.
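
As an illustration of how the multiplier $\beta$ is fixed by the constraint, the sketch below treats a hypothetical six-sided die with $F(x) = x$ and a prescribed mean of 4.5. The example, the bisection bracket, and the function names are all assumptions made here; the text only specifies the exponential form of the result.

```python
import math

def maxent_distribution(values, beta):
    """p_i proportional to exp(-beta * F(x_i)), normalized to unity."""
    shift = min(beta * v for v in values)  # keeps every exponent non-positive
    weights = [math.exp(-(beta * v - shift)) for v in values]
    z = sum(weights)
    return [w / z for w in weights]

def mean_value(values, probs):
    """Expectation value of F over the distribution."""
    return sum(v * p for v, p in zip(values, probs))

def solve_beta(values, target, lo=-50.0, hi=50.0, iterations=200):
    """Bisection on beta; the mean decreases monotonically as beta grows."""
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if mean_value(values, maxent_distribution(values, mid)) > target:
            lo = mid  # mean too large: beta must increase
        else:
            hi = mid
    return 0.5 * (lo + hi)

faces = [1, 2, 3, 4, 5, 6]  # hypothetical example: a die with F(x) = x
beta = solve_beta(faces, target=4.5)
probs = maxent_distribution(faces, beta)
entropy = -sum(p * math.log(p) for p in probs)
print(round(beta, 4), [round(p, 4) for p in probs], round(entropy, 4))
```

Since the required mean exceeds the unbiased value of 3.5, the multiplier comes out negative and the weights grow towards the larger faces; the entropy is correspondingly below $\ln 6$.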

In analogy with eq.(II.68), we can define an entropy for a continuous random variable
($\mathcal{S}_x = \{-\infty < x < \infty\}$) as

$$ S = -\int dx\, p(x)\ln p(x) = -\langle \ln p(x)\rangle . \tag{II.71} $$

There are, however, problems with this definition, as for example $S$ is not invariant under
a one to one mapping. (After a change of variable to $f = F(x)$, the entropy is changed
by $\langle \ln|F'(x)|\rangle$.) Since the Jacobian of a canonical transformation is unity, canonically
conjugate pairs offer a suitable choice of coordinates in classical statistical mechanics. The
ambiguities are also removed if the continuous variable is discretized. This happens quite
naturally in quantum statistical mechanics where it is usually possible to work with a
discrete ladder of states. The appropriate volume for discretization of phase space is set
by Planck’s constant $h$.
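
The lack of invariance can be made explicit with a short calculation. The following is a sketch, assuming that $F$ is monotonic and differentiable so that probabilities transform as $p_f(f)\,|df| = p(x)\,|dx|$; the symbols $S_x$, $S_f$, and $p_f$ are introduced here for the entropies and density before and after the mapping.

```latex
\begin{aligned}
S_f &= -\int df\, p_f(f)\,\ln p_f(f),
      \qquad p_f(f) = \frac{p(x)}{|F'(x)|}, \quad df = |F'(x)|\,dx \\
    &= -\int dx\, p(x)\,\bigl[\ln p(x) - \ln|F'(x)|\bigr] \\
    &= S_x + \bigl\langle \ln|F'(x)| \bigr\rangle .
\end{aligned}
```

For a simple rescaling $f = \lambda x$ the shift is $\ln\lambda$, so the continuous entropy depends on the units in which the variable is measured, while a mapping with $|F'(x)| = 1$ leaves it unchanged.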




III. Kinetic Theory of Gases 


III.A General Definitions


• Kinetic theory studies the macroscopic properties of large numbers of particles, starting
from their (classical) equations of motion.

Thermodynamics describes the equilibrium behavior of macroscopic objects in terms 
of concepts such as work, heat, and entropy. The phenomenological laws of thermody- 
namics tell us how these quantities are constrained as a system approaches its equilibrium. 
At the microscopic level, we know that these systems are composed of particles (atoms, 
molecules), whose interactions and dynamics are reasonably well understood in terms of 
more fundamental theories. If these microscopic descriptions are complete, we should be 
able to account for the macroscopic behavior, i.e. derive the laws governing the macro- 
scopic state functions in equilibrium. Kinetic theory attempts to achieve this objective. 
In particular, we shall try to answer the following questions: 

(1) How can we define “equilibrium” for a system of moving particles? 
(2) Do all systems naturally evolve towards an equilibrium state? 
(3) What is the time evolution of a system that is not quite in equilibrium? 

The simplest system to study, the veritable workhorse of thermodynamics, is the
dilute (nearly ideal) gas. A typical volume of gas contains of the order of $10^{23}$ particles.
Kinetic theory attempts to deduce the macroscopic properties of the gas from the time
evolution of the individual atomic coordinates. At any time $t$, the microstate of a system
of $N$ particles is described by specifying the positions $\vec q_i(t)$, and momenta $\vec p_i(t)$, of all
particles. The microstate thus corresponds to a point $\mu(t)$, in the $6N$ dimensional phase
space $\Gamma = \prod_{i=1}^{N}\{\vec q_i, \vec p_i\}$. The time evolution of this point is governed by the canonical
equations

$$ \frac{\partial \vec q_i}{\partial t} = \frac{\partial H}{\partial \vec p_i} , \qquad \frac{\partial \vec p_i}{\partial t} = -\frac{\partial H}{\partial \vec q_i} , \tag{III.1} $$

where the Hamiltonian $H(\mathbf{p},\mathbf{q})$ describes the total energy in terms of the set of coordinates
$\mathbf{q} \equiv \{\vec q_1, \vec q_2, \cdots, \vec q_N\}$, and momenta $\mathbf{p} \equiv \{\vec p_1, \vec p_2, \cdots, \vec p_N\}$. The microscopic equations of
motion have time reversal symmetry, i.e. if all the momenta are suddenly reversed, $\mathbf{p} \to -\mathbf{p}$,
at $t = 0$, the particles retrace their previous trajectory, $\mathbf{q}(t) = \mathbf{q}(-t)$. This follows
from the invariance of $H$ under the transformation $T(\mathbf{p},\mathbf{q}) \to (-\mathbf{p},\mathbf{q})$.
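
Time reversal symmetry can be illustrated with a small numerical experiment. The sketch below assumes a single particle of unit mass with the hypothetical Hamiltonian $H = p^2/2 + q^4/4$, and integrates the canonical equations with the velocity Verlet scheme, chosen here because the discrete update is itself reversible. After running forward, the momentum is reversed and the same number of steps is taken again.

```python
def force(q):
    """Force -dH/dq for the hypothetical H = p^2/2 + q^4/4 (unit mass)."""
    return -q ** 3

def velocity_verlet(q, p, dt, steps):
    """Advance (q, p) with the time-reversible velocity Verlet scheme."""
    for _ in range(steps):
        p += 0.5 * dt * force(q)
        q += dt * p
        p += 0.5 * dt * force(q)
    return q, p

q0, p0 = 1.0, 0.3
q1, p1 = velocity_verlet(q0, p0, dt=0.01, steps=2000)   # run forward
q2, p2 = velocity_verlet(q1, -p1, dt=0.01, steps=2000)  # reverse momentum, run again
# The particle returns to its starting point with the opposite momentum,
# up to floating point round-off.
print(q2 - q0, -p2 - p0)
```

The same experiment with an integrator that is not reversible (a simple forward Euler step, for instance) would retrace the path only approximately.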




As formulated within thermodynamics, the macrostate M, of an ideal gas in equi- 
librium is described by a small number of state functions such as E, T, P, and N. The 
space of macrostates is considerably smaller than the phase space spanned by microstates. 
Therefore, there must be a very large number of microstates corresponding to the same 
macrostate $M$.

This many to one correspondence suggests the introduction of a statistical ensemble
of microstates. Consider $\mathcal{N}$ copies of a particular macrostate, each described by a different
representative point $\mu_n(t)$, in the phase space $\Gamma$. Let $d\mathcal{N}(\mathbf{p},\mathbf{q},t)$ equal the number of
representative points in an infinitesimal volume $d\Gamma = \prod_{i=1}^{N} d^3\vec p_i\, d^3\vec q_i$, around the point
$(\mathbf{p},\mathbf{q})$. A phase space density $\rho(\mathbf{p},\mathbf{q},t)$ is then defined from

$$ \rho(\mathbf{p},\mathbf{q},t)\, d\Gamma = \lim_{\mathcal{N}\to\infty} \frac{d\mathcal{N}(\mathbf{p},\mathbf{q},t)}{\mathcal{N}} . \tag{III.2} $$

This quantity can be compared with the objective probability introduced in the previous
section. Clearly $\int d\Gamma\, \rho = 1$, and $\rho$ is a properly normalized probability density function in
phase space. To compute macroscopic values for various functions $O(\mathbf{p},\mathbf{q})$, we shall use
the ensemble averages

$$ \langle O \rangle = \int d\Gamma\, \rho(\mathbf{p},\mathbf{q},t)\, O(\mathbf{p},\mathbf{q}) . \tag{III.3} $$


When the exact microstate $\mu$ is specified, the system is said to be in a pure state.
On the other hand, when our knowledge of the system is probabilistic, in the sense of its
being taken from an ensemble with density $\rho(\Gamma)$, it is said to belong to a mixed state. It
is difficult to describe equilibrium in the context of a pure state, since $\mu(t)$ is constantly
changing in time according to eqs.(III.1). Equilibrium is more conveniently described for
mixed states by examining the time evolution of the phase space density $\rho(t)$, which is
governed by Liouville’s equation, introduced in the next section.


III.B Liouville’s Theorem


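• Liouville’s Theorem states that the phase space density $\rho(\Gamma,t)$ behaves like an
incompressible fluid.

Proof: Follow the evolution of $d\mathcal{N}$ pure states in an infinitesimal volume $d\Gamma = \prod_{i=1}^{N} d^3\vec p_i\, d^3\vec q_i$
around the point $(\mathbf{p},\mathbf{q})$. According to eqs.(III.1), after an interval $\delta t$ these
states have moved to the vicinity of another point $(\mathbf{p}',\mathbf{q}')$, where

$$ q'_\alpha = q_\alpha + \dot q_\alpha\, \delta t + O(\delta t^2) , \qquad p'_\alpha = p_\alpha + \dot p_\alpha\, \delta t + O(\delta t^2) . \tag{III.4} $$

In the above expression, the $q_\alpha$ and $p_\alpha$ refer to any of the $6N$ coordinates and momenta,
and $\dot q_\alpha$ and $\dot p_\alpha$ are the corresponding velocities. The original volume element $d\Gamma$ is in the
shape of a hyper-cube of sides $dp_\alpha$ and $dq_\alpha$. In the time interval $\delta t$ it gets distorted, and
the projected sides of the new volume element are given by

$$ dq'_\alpha = dq_\alpha + \frac{\partial \dot q_\alpha}{\partial q_\alpha}\, dq_\alpha\, \delta t + O(\delta t^2) , \qquad dp'_\alpha = dp_\alpha + \frac{\partial \dot p_\alpha}{\partial p_\alpha}\, dp_\alpha\, \delta t + O(\delta t^2) . \tag{III.5} $$

To order of $\delta t^2$, the new volume element is $d\Gamma' = \prod_{i=1}^{N} d^3\vec p_i{}'\, d^3\vec q_i{}'$. From eqs.(III.5) it
follows that for each pair of conjugate coordinates

$$ dq'_\alpha \cdot dp'_\alpha = dq_\alpha \cdot dp_\alpha \left[ 1 + \left( \frac{\partial \dot q_\alpha}{\partial q_\alpha} + \frac{\partial \dot p_\alpha}{\partial p_\alpha} \right) \delta t + O(\delta t^2) \right] . \tag{III.6} $$

But since the time evolution of coordinates and momenta is governed by the canonical
eqs.(III.1), we have

$$ \frac{\partial \dot q_\alpha}{\partial q_\alpha} = \frac{\partial}{\partial q_\alpha}\frac{\partial H}{\partial p_\alpha} = \frac{\partial^2 H}{\partial p_\alpha \partial q_\alpha} , \quad \text{and} \quad \frac{\partial \dot p_\alpha}{\partial p_\alpha} = \frac{\partial}{\partial p_\alpha}\left( -\frac{\partial H}{\partial q_\alpha} \right) = -\frac{\partial^2 H}{\partial q_\alpha \partial p_\alpha} . \tag{III.7} $$

Thus the projected area in eq.(III.6) is unchanged for any pair of coordinates, and hence
the volume element is unaffected, $d\Gamma' = d\Gamma$. All the pure states $d\mathcal{N}$, originally in the
vicinity of $(\mathbf{p},\mathbf{q})$ are transported to the neighborhood of $(\mathbf{p}',\mathbf{q}')$, but occupy exactly the
same volume. The ratio $d\mathcal{N}/d\Gamma$ is left unchanged, and $\rho$ behaves like the density of an
incompressible fluid.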
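
The statement $d\Gamma' = d\Gamma$ is equivalent to saying that the Jacobian determinant of the map $(q,p) \to (q',p')$ is unity. The sketch below checks this numerically for one degree of freedom, assuming the hypothetical pendulum Hamiltonian $H = p^2/2 - \cos q$. The flow is approximated with velocity Verlet steps, and the Jacobian is estimated by central differences; both choices are made here for illustration, and since this integrator preserves phase space area by construction, the check illustrates the theorem rather than proving it.

```python
import math

def flow(q, p, dt=0.001, steps=5000):
    """Approximate Hamiltonian flow of H = p^2/2 - cos(q) by velocity Verlet."""
    for _ in range(steps):
        p -= 0.5 * dt * math.sin(q)  # dp/dt = -dH/dq = -sin(q)
        q += dt * p                  # dq/dt = dH/dp = p
        p -= 0.5 * dt * math.sin(q)
    return q, p

def jacobian_determinant(q, p, eps=1e-6):
    """Central-difference estimate of d(q', p')/d(q, p) for the flow map."""
    qa, pa = flow(q + eps, p)
    qb, pb = flow(q - eps, p)
    qc, pc = flow(q, p + eps)
    qd, pd = flow(q, p - eps)
    dq_dq, dp_dq = (qa - qb) / (2 * eps), (pa - pb) / (2 * eps)
    dq_dp, dp_dp = (qc - qd) / (2 * eps), (pc - pd) / (2 * eps)
    return dq_dq * dp_dp - dq_dp * dp_dq

for q, p in ((0.5, 0.0), (1.0, 0.8), (2.5, -0.3)):
    print(q, p, jacobian_determinant(q, p))
```

Although the individual entries of the Jacobian matrix change along the trajectory as the volume element is sheared, their determinant stays at one within the accuracy of the finite differences.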

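The incompressibility condition $\rho(\mathbf{p}',\mathbf{q}',t+\delta t) = \rho(\mathbf{p},\mathbf{q},t)$ can be written in differential
form as

$$ \frac{d\rho}{dt} = \frac{\partial\rho}{\partial t} + \sum_{\alpha=1}^{3N}\left( \frac{\partial\rho}{\partial p_\alpha}\cdot\frac{dp_\alpha}{dt} + \frac{\partial\rho}{\partial q_\alpha}\cdot\frac{dq_\alpha}{dt} \right) = 0 . \tag{III.8} $$

Note the distinction between $\partial\rho/\partial t$ and $d\rho/dt$: The former partial derivative refers to the
changes in $\rho$ at a particular location in phase space, while the latter total derivative follows
the evolution of a volume of fluid as it moves in phase space. Substituting from eq.(III.1)
into eq.(III.8) leads to

$$ \frac{\partial\rho}{\partial t} = \sum_{\alpha=1}^{3N}\left( \frac{\partial\rho}{\partial p_\alpha}\cdot\frac{\partial H}{\partial q_\alpha} - \frac{\partial\rho}{\partial q_\alpha}\cdot\frac{\partial H}{\partial p_\alpha} \right) = -\{\rho, H\} , \tag{III.9} $$

where we have introduced the Poisson bracket of two functions in phase space as

$$ \{A,B\} \equiv \sum_{\alpha=1}^{3N}\left( \frac{\partial A}{\partial q_\alpha}\cdot\frac{\partial B}{\partial p_\alpha} - \frac{\partial A}{\partial p_\alpha}\cdot\frac{\partial B}{\partial q_\alpha} \right) = -\{B,A\} . \tag{III.10} $$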


(1) Under the action of time reversal, $(\mathbf{p},\mathbf{q},t) \to (-\mathbf{p},\mathbf{q},-t)$, the Poisson bracket $\{\rho,H\}$
changes sign, and eq.(III.9) implies that the density reverses its evolution, i.e. $\rho(\mathbf{p},\mathbf{q},t) = \rho(-\mathbf{p},\mathbf{q},-t)$.

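(2) The time evolution of the ensemble average in eq.(III.3) is given by (using eq.(III.9))

$$ \frac{d\langle O\rangle}{dt} = \int d\Gamma\, \frac{\partial\rho(\mathbf{p},\mathbf{q},t)}{\partial t}\, O(\mathbf{p},\mathbf{q}) = \sum_{\alpha=1}^{3N}\int d\Gamma\, O(\mathbf{p},\mathbf{q}) \left( \frac{\partial\rho}{\partial p_\alpha}\cdot\frac{\partial H}{\partial q_\alpha} - \frac{\partial\rho}{\partial q_\alpha}\cdot\frac{\partial H}{\partial p_\alpha} \right) . \tag{III.11} $$

The partial derivatives of $\rho$ in the above equation can be removed by using the method
of integration by parts, i.e. $\int f\rho' = -\int \rho f'$ since $\rho$ vanishes on the boundaries of the
integrations, leading to

$$ \frac{d\langle O\rangle}{dt} = -\sum_{\alpha=1}^{3N}\int d\Gamma\, \rho \left[ \left( \frac{\partial O}{\partial p_\alpha}\cdot\frac{\partial H}{\partial q_\alpha} - \frac{\partial O}{\partial q_\alpha}\cdot\frac{\partial H}{\partial p_\alpha} \right) + O\left( \frac{\partial^2 H}{\partial p_\alpha \partial q_\alpha} - \frac{\partial^2 H}{\partial q_\alpha \partial p_\alpha} \right) \right] = -\int d\Gamma\, \rho\, \{H, O\} = \langle \{O, H\} \rangle . \tag{III.12} $$

Note that the total time derivative cannot be taken inside the integral sign, i.e.

$$ \frac{d\langle O\rangle}{dt} \ne \int d\Gamma\, \frac{d\rho(\mathbf{p},\mathbf{q},t)}{dt}\, O(\mathbf{p},\mathbf{q}) . \tag{III.13} $$

This common mistake yields $d\langle O\rangle/dt = 0$!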
(3) If the members of the ensemble correspond to an equilibrium macroscopic state, the
ensemble averages must be independent of time. This can be achieved by a stationary
density, $\partial\rho_{eq}/\partial t = 0$, i.e. by requiring

$$ \{\rho_{eq}, H\} = 0 . \tag{III.14} $$

A possible solution to the above equation is for $\rho_{eq}$ to be a function of $H$, i.e. $\rho_{eq}(\mathbf{p},\mathbf{q}) = \rho(H(\mathbf{p},\mathbf{q}))$.
It is then easy to verify that $\{\rho(H),H\} = \rho'(H)\{H,H\} = 0$. This solution
implies that the value of $\rho$ is constant on surfaces of constant energy $H$, in phase space. This
is indeed the basic assumption of statistical mechanics. For example, in the microcanonical
ensemble, the total energy $E$ of an isolated system is specified. All members of the ensemble
must then be located on the surface $H(\mathbf{p},\mathbf{q}) = E$ in phase space. Eq.(III.9) implies
that a uniform density of points on this surface is stationary in time. The assumption
of statistical mechanics is that the macrostate is indeed represented by such a uniform
density of microstates. This is equivalent to replacing the objective measure of probability
in eq.(III.2) with a subjective one.

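There may be additional conserved quantities associated with the Hamiltonian which
satisfy $\{L_n, H\} = 0$. In the presence of such quantities, a stationary density exists for any
function of the form $\rho_{eq}(\mathbf{p},\mathbf{q}) = \rho\left(H(\mathbf{p},\mathbf{q}), L_1(\mathbf{p},\mathbf{q}), L_2(\mathbf{p},\mathbf{q}), \cdots\right)$. Clearly, the value of
$L_n$ is not changed during the evolution of the system, since

$$ \begin{aligned} \frac{dL_n(\mathbf{p},\mathbf{q})}{dt} &\equiv \frac{L_n\left(\mathbf{p}(t+dt),\mathbf{q}(t+dt)\right) - L_n\left(\mathbf{p}(t),\mathbf{q}(t)\right)}{dt} \\ &= \sum_{\alpha=1}^{3N}\left( \frac{\partial L_n}{\partial p_\alpha}\cdot\frac{\partial p_\alpha}{\partial t} + \frac{\partial L_n}{\partial q_\alpha}\cdot\frac{\partial q_\alpha}{\partial t} \right) \\ &= -\sum_{\alpha=1}^{3N}\left( \frac{\partial L_n}{\partial p_\alpha}\cdot\frac{\partial H}{\partial q_\alpha} - \frac{\partial L_n}{\partial q_\alpha}\cdot\frac{\partial H}{\partial p_\alpha} \right) = \{L_n, H\} = 0 . \end{aligned} \tag{III.15} $$

Hence, the functional dependence of $\rho_{eq}$ on these quantities merely indicates that all
accessible states, i.e. those that can be connected without violating any conservation law,
are equally likely.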
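
The condition $\{L_n, H\} = 0$ can be tested numerically for a specific case. The sketch below assumes a hypothetical particle in two dimensions with the central-force Hamiltonian $H = |\vec p|^2/2 + |\vec q|^4/4$, and evaluates Poisson brackets by central differences. The angular momentum commutes with $H$, while a single component of momentum does not. All function names are introduced here for illustration.

```python
def poisson_bracket(a, b, q, p, eps=1e-5):
    """Central-difference estimate of {A, B} at the phase space point (q, p)."""
    total = 0.0
    for k in range(len(q)):
        def shifted(f, dq, dp):
            # Evaluate f with coordinate k shifted by dq and momentum k by dp.
            q2, p2 = list(q), list(p)
            q2[k] += dq
            p2[k] += dp
            return f(q2, p2)
        da_dq = (shifted(a, eps, 0) - shifted(a, -eps, 0)) / (2 * eps)
        da_dp = (shifted(a, 0, eps) - shifted(a, 0, -eps)) / (2 * eps)
        db_dq = (shifted(b, eps, 0) - shifted(b, -eps, 0)) / (2 * eps)
        db_dp = (shifted(b, 0, eps) - shifted(b, 0, -eps)) / (2 * eps)
        total += da_dq * db_dp - da_dp * db_dq
    return total

def hamiltonian(q, p):
    """Hypothetical central-force example: H = |p|^2/2 + |q|^4/4."""
    r2 = q[0] ** 2 + q[1] ** 2
    return 0.5 * (p[0] ** 2 + p[1] ** 2) + 0.25 * r2 ** 2

def angular_momentum(q, p):
    return q[0] * p[1] - q[1] * p[0]

def x_momentum(q, p):
    return p[0]

point_q, point_p = [0.7, -0.4], [0.2, 1.1]
print(poisson_bracket(angular_momentum, hamiltonian, point_q, point_p))  # conserved
print(poisson_bracket(x_momentum, hamiltonian, point_q, point_p))        # not conserved
```

A stationary density for this system may therefore depend on the angular momentum as well as on the energy, but not on the individual momentum components.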

(4) The above postulate for $\rho_{eq}$ answers the first question posed at the beginning of
this chapter. However, in order to answer the second question, and to justify the basic
assumption of statistical mechanics, we need to show that non-stationary densities converge
onto the stationary solution $\rho_{eq}$. This contradicts the time reversal symmetry noted in (1)
above: For any solution $\rho(t)$ converging to $\rho_{eq}$, there is a time reversed solution that
diverges from it. The best that can be hoped for is to show that the solutions $\rho(t)$ are in
the neighborhood of $\rho_{eq}$ the majority of the time, so that time averages are dominated by
the stationary solution. This brings us to the problem of ergodicity, which is whether it is
justified to replace time averages with ensemble averages. In measuring the properties of
any system, we deal with only one representative of the equilibrium ensemble. However,
most macroscopic properties do not have instantaneous values and require some form
of averaging. For example, the pressure $P$ exerted by a gas results from the impact of
particles on the walls of the container. The number and momenta of these particles vary
at different times and different locations. The measured pressure reflects an average over
many characteristic microscopic times. If over this time scale the representative point of
the system moves around and uniformly samples the accessible points in phase space, we
may replace the time average with the ensemble average. For a few systems it is possible
to prove an ergodic theorem, which states that the representative point comes arbitrarily
close to all accessible points in phase space after a sufficiently long time. However, the
proof usually works for time intervals that grow exponentially with the number of particles
$N$, and thus exceed by far any reasonable time scale over which the pressure of a gas is
typically measured. As such the proofs of the ergodic theorem have so far little to do with
the reality of macroscopic equilibrium.




III.C The Bogoliubov-Born-Green-Kirkwood-Yvon Hierarchy 


The full phase space density contains much more information than necessary for
description of equilibrium properties. For example, knowledge of the one particle distribution
is sufficient for computing the pressure of a gas. A one particle density refers to the
expectation value of finding any of the $N$ particles at location $\vec q$, with momentum $\vec p$, at time
$t$, which is computed from the full density $\rho$ as

$$ \begin{aligned} f_1(\vec p,\vec q,t) &= \left\langle \sum_{i=1}^{N} \delta^3(\vec p-\vec p_i)\,\delta^3(\vec q-\vec q_i) \right\rangle \\ &= N\int \prod_{i=2}^{N} d^3\vec p_i\, d^3\vec q_i\; \rho(\vec p_1=\vec p,\vec q_1=\vec q,\vec p_2,\vec q_2,\cdots,\vec p_N,\vec q_N,t) . \end{aligned} \tag{III.16} $$

To obtain the second identity above, we used the first pair of delta functions to perform one
set of integrals, and then assumed that the density is symmetric with respect to permuting
the particles. Similarly, a two particle density can be computed from

$$ f_2(\vec p_1,\vec q_1,\vec p_2,\vec q_2,t) = N(N-1)\int \prod_{i=3}^{N} dV_i\; \rho(\vec p_1,\vec q_1,\vec p_2,\vec q_2,\cdots,\vec p_N,\vec q_N,t) , \tag{III.17} $$

where $dV_i = d^3\vec p_i\, d^3\vec q_i$ is the contribution of particle $i$ to phase space volume. The general
$s$-particle density is defined by

$$ f_s(\vec p_1,\cdots,\vec q_s,t) = \frac{N!}{(N-s)!}\int \prod_{i=s+1}^{N} dV_i\; \rho(\mathbf{p},\mathbf{q},t) = \frac{N!}{(N-s)!}\,\rho_s(\vec p_1,\cdots,\vec q_s,t) , \tag{III.18} $$

where $\rho_s$ is a standard unconditional PDF for the coordinates of $s$ particles, and $\rho_N \equiv \rho$.
While $\rho_s$ is properly normalized to unity when integrated over all its variables, the
$s$-particle density has a normalization of $N!/(N-s)!$. We shall use the two quantities
interchangeably.

The evolution of the few-body densities is governed by the BBGKY hierarchy of
equations attributed to Bogoliubov, Born, Green, Kirkwood, and Yvon. The simplest
non-trivial Hamiltonian studied in kinetic theory is

$$ H(\mathbf{p},\mathbf{q}) = \sum_{i=1}^{N}\left[ \frac{\vec p_i^{\;2}}{2m} + U(\vec q_i) \right] + \frac{1}{2}\sum_{(i,j)=1}^{N} V(\vec q_i-\vec q_j) . \tag{III.19} $$

This Hamiltonian provides an adequate description of a weakly interacting gas. In addition
to the classical kinetic energy of particles of mass $m$, it contains an external potential
$U$, and a two-body interaction $V$, between the particles. In principle, three and higher
body interactions should also be included for a realistic description, but they are not very
important in the dilute gas (nearly ideal) limit.


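For evaluating the time evolution of $f_s$, it is convenient to divide the Hamiltonian into

$$ H = H_s + H_{N-s} + H' , \tag{III.20} $$

where $H_s$ and $H_{N-s}$ include only interactions among each group of particles,

$$ \begin{aligned} H_s &= \sum_{n=1}^{s}\left[ \frac{\vec p_n^{\;2}}{2m} + U(\vec q_n) \right] + \frac{1}{2}\sum_{(n,m)=1}^{s} V(\vec q_n-\vec q_m) , \\ H_{N-s} &= \sum_{i=s+1}^{N}\left[ \frac{\vec p_i^{\;2}}{2m} + U(\vec q_i) \right] + \frac{1}{2}\sum_{(i,j)=s+1}^{N} V(\vec q_i-\vec q_j) , \end{aligned} \tag{III.21} $$

while the interparticle interactions are contained in

$$ H' = \sum_{n=1}^{s}\sum_{i=s+1}^{N} V(\vec q_n-\vec q_i) . \tag{III.22} $$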

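From eq.(III.18), the time evolution of $f_s$ (or $\rho_s$) is obtained as

$$ \frac{\partial\rho_s}{\partial t} = \int \prod_{i=s+1}^{N} dV_i\, \frac{\partial\rho}{\partial t} = -\int \prod_{i=s+1}^{N} dV_i\, \{\rho, H_s + H_{N-s} + H'\} , \tag{III.23} $$

where eq.(III.9) is used for the evolution of $\rho$. The three Poisson brackets in eq.(III.23)
will now be evaluated in turn. Since the first $s$ coordinates are not integrated, the order
of integrations and differentiations for the Poisson bracket may be reversed, and

$$ \int \prod_{i=s+1}^{N} dV_i\, \{\rho, H_s\} = \left\{ \left( \int \prod_{i=s+1}^{N} dV_i\, \rho \right), H_s \right\} = \{\rho_s, H_s\} . \tag{III.24} $$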

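Writing the Poisson brackets explicitly, the second term of eq.(III.23) takes the form

$$ -\int \prod_{i=s+1}^{N} dV_i\, \{\rho, H_{N-s}\} = \int \prod_{i=s+1}^{N} dV_i \sum_{j=1}^{N}\left[ \frac{\partial\rho}{\partial\vec p_j}\cdot\frac{\partial H_{N-s}}{\partial\vec q_j} - \frac{\partial\rho}{\partial\vec q_j}\cdot\frac{\partial H_{N-s}}{\partial\vec p_j} \right] $$

(using eq.(III.21))

$$ = \int \prod_{i=s+1}^{N} dV_i \sum_{j=s+1}^{N}\left[ \frac{\partial\rho}{\partial\vec p_j}\cdot\left( \frac{\partial U}{\partial\vec q_j} + \frac{1}{2}\sum_{k=s+1}^{N}\frac{\partial V(\vec q_j-\vec q_k)}{\partial\vec q_j} \right) - \frac{\partial\rho}{\partial\vec q_j}\cdot\frac{\vec p_j}{m} \right] = 0 . \tag{III.25} $$

The last equality is obtained after performing the integrations by parts: The term multiplying
$\partial\rho/\partial\vec p_j$ has no dependence on $\vec p_j$, while $\vec p_j/m$ does not depend on $\vec q_j$. The final term
in eq.(III.23), involving the Poisson bracket with $H'$, is

$$ \begin{aligned} &\int \prod_{i=s+1}^{N} dV_i \sum_{j=1}^{N}\left[ \frac{\partial\rho}{\partial\vec p_j}\cdot\frac{\partial H'}{\partial\vec q_j} - \frac{\partial\rho}{\partial\vec q_j}\cdot\frac{\partial H'}{\partial\vec p_j} \right] \\ &\quad = \int \prod_{i=s+1}^{N} dV_i \left[ \sum_{n=1}^{s}\frac{\partial\rho}{\partial\vec p_n}\cdot\sum_{j=s+1}^{N}\frac{\partial V(\vec q_n-\vec q_j)}{\partial\vec q_n} + \sum_{j=s+1}^{N}\frac{\partial\rho}{\partial\vec p_j}\cdot\sum_{n=1}^{s}\frac{\partial V(\vec q_j-\vec q_n)}{\partial\vec q_j} \right] , \end{aligned} $$

where the sum over all particles has been subdivided into the two groups. (Note that $H'$
in eq.(III.22) has no dependence on the momenta.) Integration by parts shows that the
second term in the above expression is zero. The first term involves the sum of $(N-s)$
expressions that are equal by symmetry and simplifies to

$$ \begin{aligned} &(N-s)\int \prod_{i=s+1}^{N} dV_i \sum_{n=1}^{s}\frac{\partial V(\vec q_n-\vec q_{s+1})}{\partial\vec q_n}\cdot\frac{\partial\rho}{\partial\vec p_n} \\ &\quad = (N-s)\sum_{n=1}^{s}\int dV_{s+1}\, \frac{\partial V(\vec q_n-\vec q_{s+1})}{\partial\vec q_n}\cdot\frac{\partial}{\partial\vec p_n}\left[ \int \prod_{i=s+2}^{N} dV_i\, \rho \right] . \end{aligned} \tag{III.26} $$

Note that the quantity in the above square brackets is $\rho_{s+1}$. Thus, adding up eqs.(III.24),
(III.25), and (III.26),

$$ \frac{\partial\rho_s}{\partial t} - \{H_s, \rho_s\} = (N-s)\sum_{n=1}^{s}\int dV_{s+1}\, \frac{\partial V(\vec q_n-\vec q_{s+1})}{\partial\vec q_n}\cdot\frac{\partial\rho_{s+1}}{\partial\vec p_n} , \tag{III.27} $$

or in terms of the densities $f_s$,

$$ \frac{\partial f_s}{\partial t} - \{H_s, f_s\} = \sum_{n=1}^{s}\int dV_{s+1}\, \frac{\partial V(\vec q_n-\vec q_{s+1})}{\partial\vec q_n}\cdot\frac{\partial f_{s+1}}{\partial\vec p_n} . \tag{III.28} $$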

In the absence of interactions with other particles, the density $\rho_s$ for a group of
$s$ particles evolves as the density of an incompressible fluid (as required by Liouville’s
theorem), and is described by the streaming terms on the left hand side of eq.(III.27).
However, because of interactions with the remaining $N-s$ particles, the flow is modified
by the collision terms on the right hand side. The collision integral is the sum of the terms
corresponding to a potential collision of any of the particles in the group of $s$, with any
of the remaining $N-s$ particles. To describe the probability of finding the additional
particle that collides with a member of this group, the result must depend on the joint
PDF of $s+1$ particles described by $\rho_{s+1}$. This results in a hierarchy of equations in which
$\partial\rho_1/\partial t$ depends on $\rho_2$, $\partial\rho_2/\partial t$ depends on $\rho_3$, etc., which is at least as complicated as
the original equation for the full phase space density. To proceed further, a physically
motivated approximation for terminating the hierarchy is needed.




III.D The Boltzmann Equation 


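To estimate the relative importance of the different terms appearing in eqs.(III.28),
let us examine the first two equations in the hierarchy,

$$ \left[ \frac{\partial}{\partial t} - \frac{\partial U}{\partial\vec q_1}\cdot\frac{\partial}{\partial\vec p_1} + \frac{\vec p_1}{m}\cdot\frac{\partial}{\partial\vec q_1} \right] f_1 = \int dV_2\, \frac{\partial V(\vec q_1-\vec q_2)}{\partial\vec q_1}\cdot\frac{\partial f_2}{\partial\vec p_1} , \tag{III.29} $$

and

$$ \begin{aligned} &\left[ \frac{\partial}{\partial t} - \frac{\partial U}{\partial\vec q_1}\cdot\frac{\partial}{\partial\vec p_1} - \frac{\partial U}{\partial\vec q_2}\cdot\frac{\partial}{\partial\vec p_2} + \frac{\vec p_1}{m}\cdot\frac{\partial}{\partial\vec q_1} + \frac{\vec p_2}{m}\cdot\frac{\partial}{\partial\vec q_2} - \frac{\partial V(\vec q_1-\vec q_2)}{\partial\vec q_1}\cdot\left( \frac{\partial}{\partial\vec p_1} - \frac{\partial}{\partial\vec p_2} \right) \right] f_2 \\ &\quad = \int dV_3 \left[ \frac{\partial V(\vec q_1-\vec q_3)}{\partial\vec q_1}\cdot\frac{\partial}{\partial\vec p_1} + \frac{\partial V(\vec q_2-\vec q_3)}{\partial\vec q_2}\cdot\frac{\partial}{\partial\vec p_2} \right] f_3 . \end{aligned} \tag{III.30} $$

Note that two of the streaming terms in eq.(III.30) have been combined by using
$\partial V(\vec q_1-\vec q_2)/\partial\vec q_1 = -\partial V(\vec q_2-\vec q_1)/\partial\vec q_2$, which is valid for a symmetric potential such that
$V(\vec q_1-\vec q_2) = V(\vec q_2-\vec q_1)$.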
• Time scales: All terms within square brackets in the above equations have dimensions
of inverse time, and we estimate their relative magnitudes by dimensional analysis, using
typical velocities and length scales. The typical speed of a gas particle at room temperature
is $v \approx 10^2\,\mathrm{m\,s^{-1}}$. For terms involving the external potential $U$, or the inter-atomic potential
$V$, an appropriate length scale can be extracted from the range of variations of the potential.
(a) The terms proportional to

$$ \frac{1}{\tau_U} \sim \frac{\partial U}{\partial\vec q}\cdot\frac{\partial}{\partial\vec p} , $$

involve spatial variations of the external potential $U(\vec q)$, which take place over macroscopic
distances $L$. We shall refer to the associated time $\tau_U$ as an extrinsic time scale,
as it can be made arbitrarily long by increasing system size. For a typical value of
$L \approx 10^{-3}\,\mathrm{m}$, we get $\tau_U \approx L/v \approx 10^{-5}\,\mathrm{s}$.

(b) From the terms involving the inter-atomic potential $V$, we can extract two additional
time scales, which are intrinsic to the gas under study. In particular, the collision
duration

$$ \frac{1}{\tau_c} \sim \frac{\partial V}{\partial\vec q}\cdot\frac{\partial}{\partial\vec p} , $$

is the typical time over which two particles are within the effective range $d$ of their
interaction. For short range interactions (including van der Waals and Lennard-Jones,
despite their power law decaying tails), $d \approx 10^{-10}\,\mathrm{m}$ is of the order of a typical atomic
size, resulting in $\tau_c \approx 10^{-12}\,\mathrm{s}$. This is usually the shortest time scale in the problem.
The analysis is somewhat more complicated for long range interactions, such as the
Coulomb gas in a plasma. For a neutral plasma, the Debye screening length $\lambda$ replaces
$d$ in the above equation, as discussed in the problems.

(c) There are also collision terms on the right hand side of eqs.(III.28), which depend on
$f_{s+1}$, and lead to an inverse time scale

$$ \frac{1}{\tau_\times} \sim \int dV\, \frac{\partial V}{\partial\vec q}\cdot\frac{\partial}{\partial\vec p}\, \frac{f_{s+1}}{f_s} \sim \int dV\, \frac{\partial V}{\partial\vec q}\cdot\frac{\partial}{\partial\vec p}\, N\frac{\rho_{s+1}}{\rho_s} . $$

The integrals are only non-zero over the volume of the inter-particle potential $d^3$. The
term $f_{s+1}/f_s$ is related to the probability of finding another particle per unit volume,
which is roughly the particle density $n = N/V \approx 10^{26}\,\mathrm{m^{-3}}$. We thus obtain the mean
free time

$$ \tau_\times \approx \frac{\tau_c}{n d^3} \approx \frac{1}{n v d^2} , \tag{III.31} $$

which is the typical time a particle travels between collisions. For short range
interactions, $\tau_\times \approx 10^{-8}\,\mathrm{s}$ is much longer than $\tau_c$, and the collision terms on the right
hand side of eqs.(III.28) are smaller by a factor of $n d^3 \approx (10^{26}\,\mathrm{m^{-3}})(10^{-10}\,\mathrm{m})^3 \approx 10^{-4}$.

The Boltzmann equation is obtained for short range interactions in the dilute regime by
exploiting $\tau_c/\tau_\times \approx n d^3 \ll 1$. (By contrast, for long range interactions such that $n d^3 \gg 1$,
the Vlasov equation is obtained by dropping the collision terms on the left hand side, as
discussed in the problems.) From the above discussion, it is apparent that eq.(III.29) is
different from the rest of the hierarchy: It is the only one in which the collision terms are
absent from the left hand side. For all other equations, the right hand side is smaller by
a factor of $n d^3$, while in eq.(III.29) it may indeed dominate the left hand side. Thus a
possible approximation scheme is to truncate the equations after the first two, by setting
the right hand side of eq.(III.30) to zero.
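
The separation of time scales used above is simple arithmetic, collected in the sketch below. The values of $v$, $d$, and $n$ are the order of magnitude estimates quoted in this section; the macroscopic length $L = 10^{-3}\,\mathrm{m}$ is an assumption chosen to be consistent with $\tau_U \approx 10^{-5}\,\mathrm{s}$.

```python
v = 1.0e2    # typical speed of a gas particle, m/s
L = 1.0e-3   # length scale of the external potential, m (assumed value)
d = 1.0e-10  # range of the inter-atomic interaction, m
n = 1.0e26   # particle density, m^-3

tau_U = L / v                   # extrinsic time scale
tau_c = d / v                   # collision duration
tau_x = 1.0 / (n * v * d ** 2)  # mean free time, eq.(III.31)
diluteness = n * d ** 3         # equals the ratio tau_c / tau_x

print(f"tau_U  = {tau_U:.1e} s")
print(f"tau_c  = {tau_c:.1e} s")
print(f"tau_x  = {tau_x:.1e} s")
print(f"n d^3  = {diluteness:.1e}")
print(f"tau_c / tau_x = {tau_c / tau_x:.1e}")
```

The ordering $\tau_c \ll \tau_\times \ll \tau_U$ that emerges is what justifies treating the collision terms as a small correction in every equation of the hierarchy except the first.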

Setting the right hand side of the equation for $f_2$ to zero implies that the two body
density evolves as in an isolated two-particle system. The relatively simple mechanical
processes that govern this evolution result in streaming terms for $f_2$ which are proportional
to both $\tau_U^{-1}$ and $\tau_c^{-1}$. The two sets of terms can be more or less treated independently:
the terms of order $\tau_U^{-1}$ describe the evolution of the center of mass of the two particles,
while the terms of order $\tau_c^{-1}$ govern the dependence on relative coordinates.




The density $f_2$ is proportional to the joint PDF $\rho_2$ for finding one particle at $(\vec p_1,\vec q_1)$,
and another at $(\vec p_2,\vec q_2)$, at the same time $t$. It is reasonable to expect that at distances
much larger than the range of the potential $V$, the particles are independent, i.e.

$$ \begin{aligned} \rho_2(\vec p_1,\vec q_1,\vec p_2,\vec q_2,t) &\to \rho_1(\vec p_1,\vec q_1,t)\,\rho_1(\vec p_2,\vec q_2,t) , \quad \text{or} \\ f_2(\vec p_1,\vec q_1,\vec p_2,\vec q_2,t) &\to f_1(\vec p_1,\vec q_1,t)\, f_1(\vec p_2,\vec q_2,t) , \quad \text{for } |\vec q_2-\vec q_1| \gg d . \end{aligned} \tag{III.32} $$

The above statement should be true even for situations out of equilibrium. For example,
imagine that the gas particles in a chamber are suddenly allowed to invade an empty volume
after the removal of a barrier. The density $f_1$ will undergo a complicated evolution, and
its relaxation time will be at least comparable to $\tau_U$. The two particle density $f_2$ will also
reach its final value at a comparable time interval. However, it is expected to relax to a
form similar to eq.(III.32) over a much shorter time of the order of $\tau_c$.

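For the collision term on the right hand side of eq.(III.29), we actually need the precise
dependence of $f_2$ on the relative coordinates and momenta at separations comparable to $d$.
At time intervals longer than $\tau_c$ (but possibly shorter than $\tau_U$), the ‘steady state’ behavior
of $f_2$ at small relative distances is obtained by equating the largest streaming terms in
eq.(III.30), i.e.

$$ \left[ \frac{\vec p_1}{m}\cdot\frac{\partial}{\partial\vec q_1} + \frac{\vec p_2}{m}\cdot\frac{\partial}{\partial\vec q_2} - \frac{\partial V(\vec q_1-\vec q_2)}{\partial\vec q_1}\cdot\left( \frac{\partial}{\partial\vec p_1} - \frac{\partial}{\partial\vec p_2} \right) \right] f_2 = 0 . \tag{III.33} $$

We expect $f_2(\vec q_1,\vec q_2)$ to have slow variations over the center of mass coordinate $\vec Q = (\vec q_1 + \vec q_2)/2$,
and large variations over the relative coordinate $\vec q = \vec q_2 - \vec q_1$. Therefore, $\partial f_2/\partial\vec q \gg \partial f_2/\partial\vec Q$,
and $\partial f_2/\partial\vec q_2 \approx -\partial f_2/\partial\vec q_1 \approx \partial f_2/\partial\vec q$, leading to

$$ \frac{\partial V(\vec q_1-\vec q_2)}{\partial\vec q_1}\cdot\left( \frac{\partial}{\partial\vec p_1} - \frac{\partial}{\partial\vec p_2} \right) f_2 = -\left( \frac{\vec p_1-\vec p_2}{m} \right)\cdot\frac{\partial}{\partial\vec q}\, f_2 . \tag{III.34} $$

The above equation provides a precise mathematical expression for how $f_2$ is constrained
along the trajectories that describe the collision of the two particles.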
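The collision term on the right hand side of eq.(III.29) can now be written as

$$ \begin{aligned} \left. \frac{df_1}{dt} \right|_{\text{coll.}} &= \int d^3\vec p_2\, d^3\vec q_2\, \frac{\partial V(\vec q_1-\vec q_2)}{\partial\vec q_1}\cdot\left( \frac{\partial}{\partial\vec p_1} - \frac{\partial}{\partial\vec p_2} \right) f_2 \\ &\approx \int d^3\vec p_2\, d^3\vec q \left( \frac{\vec p_2-\vec p_1}{m} \right)\cdot\frac{\partial}{\partial\vec q}\, f_2(\vec p_1,\vec q_1,\vec p_2,\vec q;t) . \end{aligned} \tag{III.35} $$

The first identity is obtained from eq.(III.29) by noting that the added term proportional
to $\partial f_2/\partial\vec p_2$ is a complete derivative and integrates to zero, while the second equality follows
from eq.(III.34), after the change of variables to $\vec q = \vec q_2 - \vec q_1$. (Since it relies on establishing
the ‘steady state’ in the relative coordinates, this approximation is valid as long as we
examine events in time with a resolution longer than $\tau_c$.)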

• Kinematics of collision and scattering: The integrand in eq.(III.35) is a derivative of
$f_2$ with respect to $\vec{q}$ along the direction of relative motion $\vec{p} = \vec{p}_2 - \vec{p}_1$ of the colliding
particles. To perform this integration we introduce a convenient coordinate system for $\vec{q}$,
guided by the formalism used to describe the scattering of particles. Naturally, we choose
one axis to be parallel to $\vec{p}_2 - \vec{p}_1$, with the corresponding coordinate $a$ which is negative
before a collision, and positive afterwards. The other two coordinates of $\vec{q}$ are represented
by an impact vector $\vec{b}$ which is $\vec{0}$ for a head-on collision ($[\vec{p}_1 - \vec{p}_2] \parallel [\vec{q}_1 - \vec{q}_2]$). We can now
integrate over $a$ to get

$$ \left.\frac{df_1}{dt}\right|_{\rm coll.} = \int d^3\vec{p}_2\, d^2\vec{b}\; |\vec{v}_1 - \vec{v}_2| \left[ f_2(\vec{p}_1, \vec{q}_1, \vec{p}_2, \vec{b}, +; t) - f_2(\vec{p}_1, \vec{q}_1, \vec{p}_2, \vec{b}, -; t) \right], \tag{III.36} $$

where $|\vec{v}_1 - \vec{v}_2| = |\vec{p}_1 - \vec{p}_2|/m$ is the relative speed of the two particles, with $(\vec{b}, -)$ and
$(\vec{b}, +)$ referring to relative coordinates before and after the collision. Note that $d^2\vec{b}\, |\vec{v}_1 - \vec{v}_2|$
is just the flux of particles impinging on the element of area $d^2\vec{b}$.

In principle, the integration over $a$ is from $-\infty$ to $+\infty$, but as the variations of $f_2$
are only significant over the interaction range $d$, we can evaluate the above quantities
at separations of a few $d$ from the collision point. This is a good compromise, allowing
us to evaluate $f_2$ away from the collisions, but at small enough separations so that we
can ignore the difference between $\vec{q}_1$ and $\vec{q}_2$. This amounts to a coarse-graining in space
which eliminates variations on scales finer than $d$. With these provisos, it is tempting to
close the equation for $f_1$, by using the assumption of uncorrelated particles in eq.(III.32).
Clearly some care is necessary as a naive substitution gives zero! The key observation is
that the densities $f_2$ for situations corresponding to before and after the collision have to
be treated differently. For example, soon after opening of the slot separating empty and
full gas containers, the momenta of the gas particles are likely to point away from the
slot. Collisions will tend to randomize momenta, yielding a more isotropic distribution.
However, the densities $f_2$ before and after the collision are related by streaming, implying
that $f_2(\vec{p}_1, \vec{q}_1, \vec{p}_2, \vec{b}, +; t) = f_2(\vec{p}_1', \vec{q}_1, \vec{p}_2', \vec{b}, -; t)$, where $\vec{p}_1'$ and $\vec{p}_2'$ are momenta whose
collision at an impact vector $\vec{b}$ results in production of outgoing particles with momenta $\vec{p}_1$
and $\vec{p}_2$. They can be obtained using time reversal symmetry, by integrating the equations
of motion for incoming colliding particles of momenta $-\vec{p}_1$ and $-\vec{p}_2$. In terms of these
momenta, we can write

$$ \left.\frac{df_1}{dt}\right|_{\rm coll.} = \int d^3\vec{p}_2\, d^2\vec{b}\; |\vec{v}_1 - \vec{v}_2| \left[ f_2(\vec{p}_1', \vec{q}_1, \vec{p}_2', \vec{b}, -; t) - f_2(\vec{p}_1, \vec{q}_1, \vec{p}_2, \vec{b}, -; t) \right]. \tag{III.37} $$

It is sometimes more convenient to describe the scattering of two particles in terms of
the relative momenta $\vec{p} = \vec{p}_1 - \vec{p}_2$ and $\vec{p}\,' = \vec{p}_1' - \vec{p}_2'$, before and after the collision. For a
given $\vec{b}$, the initial momentum $\vec{p}$ is deterministically transformed to the final momentum
$\vec{p}\,'$. To find the functional form $\vec{p}\,'(\vec{p}, \vec{b})$, one must integrate the equations of motion.
However, it is possible to make some general statements based on conservation laws: In
elastic collisions, the magnitude of $\vec{p}$ is preserved, and it merely rotates to a final direction
indicated by the angles $(\theta, \phi) \equiv \hat{\Omega}(\vec{b})$ (a unit vector) in spherical coordinates. Since there
is a one to one correspondence between the impact vector $\vec{b}$ and the solid angle $\Omega$, we
make a change of variables between the two, resulting in

$$ \left.\frac{df_1}{dt}\right|_{\rm coll.} = \int d^3\vec{p}_2\, d^2\Omega \left| \frac{d\sigma}{d\Omega} \right| |\vec{v}_1 - \vec{v}_2| \left[ f_2(\vec{p}_1', \vec{q}_1, \vec{p}_2', \vec{b}, -; t) - f_2(\vec{p}_1, \vec{q}_1, \vec{p}_2, \vec{b}, -; t) \right]. \tag{III.38} $$

The Jacobian of this transformation, $|d\sigma/d\Omega|$, has dimensions of area, and is known as
the differential cross-section. It is equal to the area presented to an incoming beam which
scatters into the solid angle $\Omega$. The outgoing momenta $\vec{p}_1'$ and $\vec{p}_2'$ in eq.(III.38) are now
obtained from the two conditions $\vec{p}_1' + \vec{p}_2' = \vec{p}_1 + \vec{p}_2$ (conservation of momentum), and
$\vec{p}_1' - \vec{p}_2' = |\vec{p}_1 - \vec{p}_2|\, \hat{\Omega}(\vec{b})$ (conservation of energy), as

$$ \begin{aligned}
\vec{p}_1' &= \left( \vec{p}_1 + \vec{p}_2 + |\vec{p}_1 - \vec{p}_2|\, \hat{\Omega}(\vec{b}) \right)/2, \\
\vec{p}_2' &= \left( \vec{p}_1 + \vec{p}_2 - |\vec{p}_1 - \vec{p}_2|\, \hat{\Omega}(\vec{b}) \right)/2.
\end{aligned} \tag{III.39} $$
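
As a quick numerical illustration of eq.(III.39), the following minimal sketch computes the outgoing momenta for one collision of equal-mass particles and checks that the total momentum and the kinetic energy are unchanged. The function names and the sample momenta are hypothetical, and the outgoing direction is simply drawn at random rather than derived from a potential.

```python
import math
import random

def outgoing_momenta(p1, p2, omega):
    """Eq.(III.39): momenta after an elastic collision of equal-mass particles.

    p1, p2 are the incoming momenta (3-tuples); omega is the unit vector
    along the outgoing relative momentum p1' - p2'.
    """
    rel = math.sqrt(sum((a - b) ** 2 for a, b in zip(p1, p2)))
    p1_out = tuple((a + b + rel * w) / 2.0 for a, b, w in zip(p1, p2, omega))
    p2_out = tuple((a + b - rel * w) / 2.0 for a, b, w in zip(p1, p2, omega))
    return p1_out, p2_out

def random_unit_vector(rng):
    """Direction drawn uniformly on the unit sphere (illustrative choice)."""
    z = rng.uniform(-1.0, 1.0)
    phi = rng.uniform(0.0, 2.0 * math.pi)
    s = math.sqrt(1.0 - z * z)
    return (s * math.cos(phi), s * math.sin(phi), z)

rng = random.Random(0)
p1 = (1.0, -2.0, 0.5)
p2 = (-0.3, 0.7, 2.0)
q1, q2 = outgoing_momenta(p1, p2, random_unit_vector(rng))

# Change in total momentum (largest component) and in p1^2 + p2^2.
momentum_change = max(abs(a + b - c - d) for a, b, c, d in zip(p1, p2, q1, q2))
energy_change = abs(sum(x * x for x in p1 + p2) - sum(x * x for x in q1 + q2))
print(momentum_change, energy_change)  # both vanish up to rounding
```

Any unit vector passed as `omega` gives a kinematically allowed outcome; the interaction potential only decides which direction goes with which impact vector.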


For the scattering of two hard spheres of diameter $D$, it is easy to show that the
scattering angle is related to the impact parameter $b$ by $\cos(\theta/2) = b/D$ for all $\phi$. The
differential cross-section is then obtained from

$$ d^2\sigma = b\, db\, d\phi = D\cos\left(\frac{\theta}{2}\right) D\sin\left(\frac{\theta}{2}\right) \frac{d\theta}{2}\, d\phi = \frac{D^2}{4}\sin\theta\, d\theta\, d\phi = \frac{D^2}{4}\, d^2\Omega. $$

(Note that the solid angle in three dimensions is given by $d^2\Omega = \sin\theta\, d\theta\, d\phi$.) Integrating
over all angles leads to the total cross-section of $\sigma = \pi D^2$, which is evidently correct. The
differential cross-section for hard spheres is independent of both $\theta$ and $|\vec{P}|$. This is not the
case for soft potentials. For example, the Coulomb potential $V = e^2/|\vec{Q}|$ leads to

$$ \left| \frac{d\sigma}{d\Omega} \right| = \left( \frac{m e^2}{2 |\vec{P}|^2 \sin^2(\theta/2)} \right)^2. $$

(The dependence on $|\vec{P}|$ can be justified by obtaining a distance of closest approach from
$|\vec{P}|^2/m + e^2/b \approx 0$.)
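
The hard-sphere result can be checked numerically. The sketch below integrates the constant differential cross-section $D^2/4$ over the full solid angle with a midpoint rule and compares the outcome with $\pi D^2$. The function names, the value of $D$, and the grid size are illustrative assumptions.

```python
import math

def hard_sphere_dsigma_domega(theta, D):
    """|d sigma / d Omega| for hard spheres of diameter D (independent of theta)."""
    return D * D / 4.0

def total_cross_section(dsigma_domega, D, n_theta=2000):
    """Midpoint-rule integral of |d sigma/d Omega| sin(theta) d(theta) d(phi)."""
    d_theta = math.pi / n_theta
    total = 0.0
    for i in range(n_theta):
        theta = (i + 0.5) * d_theta
        total += dsigma_domega(theta, D) * math.sin(theta) * d_theta
    # The phi integral contributes a factor 2*pi by azimuthal symmetry.
    return 2.0 * math.pi * total

D = 1.5
sigma = total_cross_section(hard_sphere_dsigma_domega, D)
print(sigma, math.pi * D * D)
```

The same routine would not converge for the Coulomb form, whose integrand diverges at small angles; that divergence reflects the infinite range of the potential.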
• The Boltzmann equation is obtained from eq.(III.38) after the substitution

$$ f_2(\vec{p}_1, \vec{q}_1, \vec{p}_2, \vec{b}, -; t) = f_1(\vec{p}_1, \vec{q}_1, t) \cdot f_1(\vec{p}_2, \vec{q}_1, t), \tag{III.40} $$

known as the assumption of molecular chaos. Note that even if one starts with an uncorrelated
initial probability distribution for particles, there is no guarantee that correlations
are not generated as a result of collisions. The final result is the following closed form
equation for $f_1$:

$$ \left[ \frac{\partial}{\partial t} - \frac{\partial U}{\partial \vec{q}_1}\cdot\frac{\partial}{\partial \vec{p}_1} + \frac{\vec{p}_1}{m}\cdot\frac{\partial}{\partial \vec{q}_1} \right] f_1 = -\int d^3\vec{p}_2\, d^2\Omega \left| \frac{d\sigma}{d\Omega} \right| |\vec{v}_1 - \vec{v}_2| \left[ f_1(\vec{p}_1, \vec{q}_1, t) f_1(\vec{p}_2, \vec{q}_1, t) - f_1(\vec{p}_1', \vec{q}_1, t) f_1(\vec{p}_2', \vec{q}_1, t) \right]. \tag{III.41} $$

Given the complexity of the above ‘derivation’ of the Boltzmann equation, it is appropriate
to provide a heuristic explanation. The streaming terms on the left hand side of the
equation describe the motion of a single particle in the external potential $U$. The collision
terms on the right hand side have a simple physical interpretation: The probability of
finding a particle of momentum $\vec{p}_1$ at $\vec{q}_1$ is suddenly altered if it undergoes a collision with
another particle of momentum $\vec{p}_2$. The probability of such a collision is the product of
kinematic factors described by the differential cross-section $|d\sigma/d\Omega|$, the ‘flux’ of incident
particles proportional to $|\vec{v}_2 - \vec{v}_1|$, and the joint probability of finding the two particles,
approximated by $f_1(\vec{p}_1) f_1(\vec{p}_2)$. The first term on the right hand side of eq.(III.41) subtracts
this probability and integrates over all possible momenta and solid angles describing the
collision. The second term represents an addition to the probability which results from the
inverse process: A particle can suddenly appear with coordinates $(\vec{p}_1, \vec{q}_1)$ as a result of a
collision between two particles initially with momenta $\vec{p}_1'$ and $\vec{p}_2'$. The cross-section, and
the momenta $(\vec{p}_1', \vec{p}_2')$, may have a complicated dependence on $(\vec{p}_1, \vec{p}_2)$ and $\Omega$, determined
by the specific form of the potential $V$. Remarkably, various equilibrium properties of the
gas are quite independent of this potential.




III.E The H-Theorem and Irreversibility


The second question posed at the beginning of this chapter was whether a collection
of particles naturally evolves towards an equilibrium state. While it is possible to obtain
steady state solutions for the full phase space density $\rho_N$, because of time reversal symmetry
these solutions are not attractors of generic non-equilibrium densities. Does the
unconditional one particle PDF $\rho_1$ suffer the same problem? While the exact density $\rho_1$
must necessarily reflect this property of $\rho_N$, the H-theorem proves that an approximate $\rho_1$,
governed by the Boltzmann equation, does in fact non-reversibly approach an equilibrium
form. This theorem states that:


• If $f_1(\vec{p}, \vec{q}, t)$ satisfies the Boltzmann equation, then $dH/dt \le 0$, where

$$ H(t) = \int d^3\vec{p}\, d^3\vec{q}\; f_1(\vec{p}, \vec{q}, t) \ln f_1(\vec{p}, \vec{q}, t). \tag{III.42} $$

The function $H(t)$ is related to the information content of the one particle PDF. Up to an
overall constant, the information content of $\rho_1 = f_1/N$ is given by $I[\rho_1] = \langle \ln \rho_1 \rangle$, which
is clearly similar to $H(t)$.

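Proof: The time derivative of $H$ is

$$ \frac{dH}{dt} = \int d^3\vec{p}_1\, d^3\vec{q}_1\, \frac{\partial f_1}{\partial t} \left( \ln f_1 + 1 \right) = \int d^3\vec{p}_1\, d^3\vec{q}_1\, \ln f_1\, \frac{\partial f_1}{\partial t}, \tag{III.43} $$

since $\int dV_1 f_1 = N \int d\Gamma\, \rho = N$ is time independent. Using eq.(III.41), we obtain

$$ \begin{aligned}
\frac{dH}{dt} = {}& \int d^3\vec{p}_1\, d^3\vec{q}_1\, \ln f_1 \left( \frac{\partial U}{\partial \vec{q}_1}\cdot\frac{\partial f_1}{\partial \vec{p}_1} - \frac{\vec{p}_1}{m}\cdot\frac{\partial f_1}{\partial \vec{q}_1} \right) \\
& - \int d^3\vec{p}_1\, d^3\vec{q}_1\, d^3\vec{p}_2\, d^2\sigma\, |\vec{v}_1 - \vec{v}_2| \left[ f_1(\vec{p}_1, \vec{q}_1) f_1(\vec{p}_2, \vec{q}_1) - f_1(\vec{p}_1', \vec{q}_1) f_1(\vec{p}_2', \vec{q}_1) \right] \ln f_1(\vec{p}_1, \vec{q}_1),
\end{aligned} \tag{III.44} $$

where we shall interchangeably use $d^2\sigma$, $d^2\vec{b}$, or $d^2\Omega\, |d\sigma/d\Omega|$ for the differential cross-section.
The streaming terms in the above expression are zero, as shown through successive
integrations by parts,

$$ \int d^3\vec{p}_1\, d^3\vec{q}_1\, \ln f_1\, \frac{\partial U}{\partial \vec{q}_1}\cdot\frac{\partial f_1}{\partial \vec{p}_1} = -\int d^3\vec{p}_1\, d^3\vec{q}_1\, f_1\, \frac{\partial U}{\partial \vec{q}_1}\cdot\frac{1}{f_1}\frac{\partial f_1}{\partial \vec{p}_1} = \int d^3\vec{p}_1\, d^3\vec{q}_1\, f_1\, \frac{\partial}{\partial \vec{p}_1}\cdot\frac{\partial U}{\partial \vec{q}_1} = 0, $$

and

$$ \int d^3\vec{p}_1\, d^3\vec{q}_1\, \ln f_1\, \frac{\vec{p}_1}{m}\cdot\frac{\partial f_1}{\partial \vec{q}_1} = -\int d^3\vec{p}_1\, d^3\vec{q}_1\, f_1\, \frac{\vec{p}_1}{m}\cdot\frac{1}{f_1}\frac{\partial f_1}{\partial \vec{q}_1} = \int d^3\vec{p}_1\, d^3\vec{q}_1\, f_1\, \frac{\partial}{\partial \vec{q}_1}\cdot\frac{\vec{p}_1}{m} = 0. $$
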


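The collision term in eq.(III.44) involves integrations over dummy variables $\vec{p}_1$ and $\vec{p}_2$. The
labels (1) and (2) can thus be exchanged without any change in the value of the integral.
Averaging the resulting two expressions gives

$$ \frac{dH}{dt} = -\frac{1}{2} \int d^3\vec{q}\, d^3\vec{p}_1\, d^3\vec{p}_2\, d^2\vec{b}\; |\vec{v}_1 - \vec{v}_2| \left[ f_1(\vec{p}_1) f_1(\vec{p}_2) - f_1(\vec{p}_1') f_1(\vec{p}_2') \right] \ln\left( f_1(\vec{p}_1) f_1(\vec{p}_2) \right). \tag{III.45} $$

(The arguments, $\vec{q}$ and $t$, of $f_1$ are suppressed for ease of notation.) We would now like
to change the variables of integration from the coordinates describing the initiators of
the collision, $(\vec{p}_1, \vec{p}_2, \vec{b})$, to those of their products, $(\vec{p}_1', \vec{p}_2', \vec{b}\,')$. The explicit functional
forms describing this transformation are complicated because of the dependence of the
solid angle $\Omega$ in eq.(III.39) on $\vec{b}$ and $|\vec{p}_2 - \vec{p}_1|$. However, we are assured that the Jacobian
of the transformation is unity because of time reversal symmetry, since for every collision
there is an inverse one obtained by reversing the momenta of the products. In terms of
the new coordinates

$$ \frac{dH}{dt} = -\frac{1}{2} \int d^3\vec{q}\, d^3\vec{p}_1'\, d^3\vec{p}_2'\, d^2\vec{b}\,'\; |\vec{v}_1 - \vec{v}_2| \left[ f_1(\vec{p}_1) f_1(\vec{p}_2) - f_1(\vec{p}_1') f_1(\vec{p}_2') \right] \ln\left( f_1(\vec{p}_1) f_1(\vec{p}_2) \right), \tag{III.46} $$

where we should now regard $(\vec{p}_1, \vec{p}_2)$ in the above equation as functions of the integration
variables $(\vec{p}_1', \vec{p}_2', \vec{b}\,')$ as in eq.(III.39). As noted earlier, $|\vec{v}_1 - \vec{v}_2| = |\vec{v}_1' - \vec{v}_2'|$ for any
elastic collision, and we can use these quantities interchangeably. Finally, we relabel the
dummy integration variables such that the primes are removed. Noting that the functional
dependence of $(\vec{p}_1, \vec{p}_2, \vec{b})$ on $(\vec{p}_1', \vec{p}_2', \vec{b}\,')$ is exactly the same as its inverse, we obtain

$$ \frac{dH}{dt} = -\frac{1}{2} \int d^3\vec{q}\, d^3\vec{p}_1\, d^3\vec{p}_2\, d^2\vec{b}\; |\vec{v}_1 - \vec{v}_2| \left[ f_1(\vec{p}_1') f_1(\vec{p}_2') - f_1(\vec{p}_1) f_1(\vec{p}_2) \right] \ln\left( f_1(\vec{p}_1') f_1(\vec{p}_2') \right). \tag{III.47} $$

Averaging eqs.(III.45) and (III.47) results in

$$ \frac{dH}{dt} = -\frac{1}{4} \int d^3\vec{q}\, d^3\vec{p}_1\, d^3\vec{p}_2\, d^2\vec{b}\; |\vec{v}_1 - \vec{v}_2| \left[ f_1(\vec{p}_1) f_1(\vec{p}_2) - f_1(\vec{p}_1') f_1(\vec{p}_2') \right] \left[ \ln\left( f_1(\vec{p}_1) f_1(\vec{p}_2) \right) - \ln\left( f_1(\vec{p}_1') f_1(\vec{p}_2') \right) \right]. \tag{III.48} $$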


The integrand of the above expression is never negative. If $f_1(\vec{p}_1) f_1(\vec{p}_2) > f_1(\vec{p}_1') f_1(\vec{p}_2')$,
both terms in square brackets are positive, while both are negative if the inequality is
reversed. In either case, their product is positive; it vanishes only when the two products
of densities are equal. The non-negativity of the integrand establishes the validity of the
H-theorem,

$$ \frac{dH}{dt} \le 0. \tag{III.49} $$
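
The sign argument can be spot-checked numerically. The sketch below evaluates the product of the two square brackets in eq.(III.48) for randomly chosen positive values standing in for $f_1(\vec{p}_1) f_1(\vec{p}_2)$ and $f_1(\vec{p}_1') f_1(\vec{p}_2')$. The function name and the sampling range are arbitrary choices made for illustration, and the check is no substitute for the proof.

```python
import math
import random

def h_integrand(x, y):
    """(x - y)(ln x - ln y), with x = f1(p1) f1(p2) and y = f1(p1') f1(p2')."""
    return (x - y) * (math.log(x) - math.log(y))

rng = random.Random(1)
samples = [(rng.uniform(1e-6, 10.0), rng.uniform(1e-6, 10.0)) for _ in range(10000)]
smallest = min(h_integrand(x, y) for x, y in samples)
print(smallest >= 0.0)         # the product never becomes negative
print(h_integrand(2.0, 2.0))   # and vanishes when the two products coincide
```

Because the logarithm is monotonically increasing, the two factors always share the same sign, which is all that the proof uses.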



• Irreversibility: The second law is an empirical formulation of the vast number of everyday
observations which support the existence of an arrow of time. Reconciling the reversibility
of laws of physics governing the microscopic domain with the observed irreversibility of
macroscopic phenomena is a fundamental problem. Of course, not all microscopic laws
of physics are reversible: weak nuclear interactions violate time reversal symmetry, and
the collapse of the quantum wave-function in the act of observation is irreversible. The
former interactions in fact do not play any significant role in everyday observations that
lead to the second law. The irreversible collapse of the wave-function may itself be an
artifact of treating macroscopic observers and microscopic observables distinctly.¹ There
are proponents of the view that the reversibility of the currently accepted microscopic
equations of motion (classical or quantum) is indicative of their inadequacy. However, the
advent of powerful computers has made it possible to simulate the evolution of collections of
large numbers of particles, governed by classical, reversible equations of motion. Although
simulations are currently limited to relatively small numbers of particles ($10^6$), they do
exhibit the irreversible macroscopic behaviors similar to those observed in nature (typically
involving $10^{23}$ particles). For example, particles initially occupying one half of a box
proceed to irreversibly, and uniformly, occupy the whole box. (This has nothing to do with
limitations of computational accuracy; the same macroscopic irreversibility is observed in
exactly reversible integer based simulations, such as with cellular automata.) Thus the
origin of the observed irreversibilities should be sought in the classical evolution of large
collections of particles.

The Boltzmann equation is the first formula we have encountered that is clearly not
time reversible, as indicated by eq.(III.49). We can thus ask the question of how we
obtained this result from the Hamiltonian equations of motion. The key to this, of course,
resides in the physically motivated approximations used to obtain eq.(III.41). The first
steps of the approximation were dropping the three body collision term on the right hand
side of eq.(III.30), and the implicit coarse-graining of the resolution in the spatial and
temporal scales. Neither of these steps explicitly violates time reversal symmetry, and the
collision term in eq.(III.37) retains this property. The next step in getting to eq.(III.41) is
to replace the two-body density $f_2(-)$, evaluated before the collision, with the product of
two one body densities according to eq.(III.32). This treats the two body densities before
and after the collision differently. We could have alternatively expressed eq.(III.37) in
terms of the two body densities $f_2(+)$ evaluated after the collision. Replacing $f_2(+)$ with
the product of two one particle densities would then lead to the opposite conclusion, with
$dH/dt \ge 0$! For a system in equilibrium, it is hard to justify one choice over the other.
However, once the system is out of equilibrium, the coordinates after the collision are more
likely to be correlated, and hence the substitution of eq.(III.32) for $f_2(+)$ does not
make sense. Time reversal symmetry implies that there should also be subtle correlations
in $f_2(-)$ which are ignored in the so-called assumption of molecular chaos.

¹ The time dependent Schrödinger equation is fully time reversible. If it is possible
to write a complicated wave-function that includes the observing apparatus (possibly the
whole universe), it is hard to see how any irreversibility may occur.

While the assumption of molecular chaos before (but not after) collisions is the key 
to the irreversibility of the Boltzmann equation, the resulting loss of information is best 
justified in terms of the coarse graining of space and time: The Liouville equation and 
its descendants contain precise information about the evolution of a pure state. This 
information, however, is inevitably transported to shorter scales. A useful image is that 
of mixing two immiscible fluids. While the two fluids remain distinct at each point, the 
transitions in space from one to the next occur at finer resolution on subsequent mixing. 
At some point, a finite resolution in any measuring apparatus will prevent keeping track 
of the two components. In the Boltzmann equation the precise information of the pure 
state is lost at the scale of collisions. The resulting one body density only describes space
and time resolutions longer than those of a two-body collision, becoming more and more
probabilistic as further information is lost.


III.F Equilibrium Properties 


What is the nature of the equilibrium state described by $f_1$, for a homogeneous gas?
(1) The equilibrium distribution: After the gas has reached equilibrium, the function $H$
should no longer decrease with time. Since the integrand in eq.(III.48) is never negative,
a necessary condition for $dH/dt = 0$ is that

$$ f_1(\vec{p}_1, \vec{q}_1) f_1(\vec{p}_2, \vec{q}_1) - f_1(\vec{p}_1', \vec{q}_1) f_1(\vec{p}_2', \vec{q}_1) = 0, \tag{III.50} $$

i.e. at each point $\vec{q}$, we must have

$$ \ln f_1(\vec{p}_1, \vec{q}) + \ln f_1(\vec{p}_2, \vec{q}) = \ln f_1(\vec{p}_1', \vec{q}) + \ln f_1(\vec{p}_2', \vec{q}). \tag{III.51} $$

The left hand side of the above equation refers to the momenta before a two-body collision,
and the right hand side to those after the collision. The equality is thus satisfied by
any additive quantity that is conserved during the collision. There are 5 such conserved
quantities for an elastic collision: the particle number, the three components of the net
momentum, and the kinetic energy. Hence, a general solution for $f_1$ is

$$ \ln f_1 = a(\vec{q}) - \vec{\alpha}(\vec{q})\cdot\vec{p} - \beta(\vec{q}) \left( \frac{p^2}{2m} \right). \tag{III.52} $$

We can easily accommodate the potential energy $U(\vec{q})$ in the above form, and set

$$ f_1(\vec{p}, \vec{q}) = \mathcal{N}(\vec{q}) \exp\left[ -\vec{\alpha}(\vec{q})\cdot\vec{p} - \beta(\vec{q}) \left( \frac{p^2}{2m} + U(\vec{q}) \right) \right]. \tag{III.53} $$


We shall refer to the above distribution as describing local equilibrium. While this form is
preserved during collisions, it will evolve in time away from collisions, due to the streaming
terms, unless $\{\mathcal{H}_1, f_1\} = 0$. The latter condition is satisfied for any function $f_1$ that
depends only on $\mathcal{H}_1$, or any other quantity that is conserved by it. Clearly, the above
density satisfies this requirement as long as $\mathcal{N}$ and $\beta$ are independent of $\vec{q}$, and $\vec{\alpha} = 0$.

According to eq.(III.16), the appropriate normalization for $f_1$ is

$$ \int d^3\vec{p}\, d^3\vec{q}\; f_1(\vec{p}, \vec{q}) = N. \tag{III.54} $$

For particles in a box of volume $V$, the potential $U(\vec{q})$ is zero inside the box, and infinite
on the outside. The normalization factor in eq.(III.53) can be obtained from eq.(III.54) as

$$ N = \mathcal{N} V \left[ \int_{-\infty}^{\infty} dp_i \exp\left( -\alpha_i p_i - \frac{\beta p_i^2}{2m} \right) \right]^3 = \mathcal{N} V \left( \frac{2\pi m}{\beta} \right)^{3/2} \exp\left( \frac{m \alpha^2}{2\beta} \right). \tag{III.55} $$


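Hence, the properly normalized Gaussian distribution for momenta is

$$ f_1(\vec{p}, \vec{q}) = n \left( \frac{\beta}{2\pi m} \right)^{3/2} \exp\left[ -\frac{\beta (\vec{p} - \vec{p}_0)^2}{2m} \right], \tag{III.56} $$

where $\vec{p}_0 = \langle \vec{p} \rangle = -m\vec{\alpha}/\beta$ is the mean value for the momentum of the gas (the sign
follows from completing the square in eq.(III.53)), which is zero for a stationary box, and
$n = N/V$ is the particle density. From the Gaussian form of the distribution it can be
easily concluded that, for $\vec{p}_0 = 0$, the variance of each component of the momentum is
$\langle p_i^2 \rangle = m/\beta$, and

$$ \langle p^2 \rangle = \langle p_x^2 + p_y^2 + p_z^2 \rangle = \frac{3m}{\beta}. \tag{III.57} $$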
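
The normalization and the variance quoted above are easy to confirm by direct quadrature. The following minimal sketch integrates one momentum component of eq.(III.56), with $\vec{p}_0 = 0$, on a uniform grid. The function names, the numerical values of $m$ and $\beta$, and the cutoff at 12 standard deviations are assumptions made only for this illustration.

```python
import math

def maxwell_1d(p, m, beta):
    """One-component factor of eq.(III.56) with p0 = 0, normalized to unity."""
    return math.sqrt(beta / (2.0 * math.pi * m)) * math.exp(-beta * p * p / (2.0 * m))

def moment(k, m, beta, n_steps=20000):
    """Midpoint-rule estimate of <p^k> for a single momentum component."""
    p_max = 12.0 * math.sqrt(m / beta)  # the integrand is negligible beyond this cutoff
    dp = 2.0 * p_max / n_steps
    total = 0.0
    for i in range(n_steps):
        p = -p_max + (i + 0.5) * dp
        total += (p ** k) * maxwell_1d(p, m, beta) * dp
    return total

m, beta = 2.0, 0.5           # arbitrary illustrative values
norm = moment(0, m, beta)    # expected: 1
var = moment(2, m, beta)     # expected: m / beta
print(norm, var, 3.0 * var)  # the last entry is <p^2> = 3 m / beta
```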




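(2) Equilibrium between two gases: Consider two different gases (a) and (b), moving in
the same potential $U$, and subject to a two-body interaction $V_{ab}(\vec{q}^{\,(a)} - \vec{q}^{\,(b)})$. We can
define one-particle densities $f_1^{(a)}$ and $f_1^{(b)}$ for the two gases respectively. In terms of a
generalized collision integral

$$ C_{\alpha,\beta} = -\int d^3\vec{p}_2\, d^2\Omega \left| \frac{d\sigma_{\alpha,\beta}}{d\Omega} \right| |\vec{v}_1 - \vec{v}_2| \left[ f_1^{(\alpha)}(\vec{p}_1, \vec{q}_1) f_1^{(\beta)}(\vec{p}_2, \vec{q}_1) - f_1^{(\alpha)}(\vec{p}_1', \vec{q}_1) f_1^{(\beta)}(\vec{p}_2', \vec{q}_1) \right], \tag{III.58} $$

the evolution of these densities is governed by a simple generalization of the Boltzmann
equation to

$$ \begin{aligned}
\frac{\partial f_1^{(a)}}{\partial t} &= -\left\{ f_1^{(a)}, \mathcal{H}_1^{(a)} \right\} + C_{a,a} + C_{a,b}, \\
\frac{\partial f_1^{(b)}}{\partial t} &= -\left\{ f_1^{(b)}, \mathcal{H}_1^{(b)} \right\} + C_{b,a} + C_{b,b}.
\end{aligned} \tag{III.59} $$

Stationary distributions can be obtained if all six terms on the right hand side of eqs.(III.59)
are zero. In the absence of inter-species collisions, i.e. for $C_{a,b} = C_{b,a} = 0$, we can obtain
independent stationary distributions $f_1^{(a)} \propto \exp\left( -\beta_a \mathcal{H}_1^{(a)} \right)$ and $f_1^{(b)} \propto \exp\left( -\beta_b \mathcal{H}_1^{(b)} \right)$.
Requiring the vanishing of $C_{a,b}$ leads to the additional constraint,

$$ \begin{aligned}
& f_1^{(a)}(\vec{p}_1) f_1^{(b)}(\vec{p}_2) - f_1^{(a)}(\vec{p}_1') f_1^{(b)}(\vec{p}_2') = 0, \quad \Longrightarrow \\
& \beta_a \mathcal{H}_1^{(a)}(\vec{p}_1) + \beta_b \mathcal{H}_1^{(b)}(\vec{p}_2) = \beta_a \mathcal{H}_1^{(a)}(\vec{p}_1') + \beta_b \mathcal{H}_1^{(b)}(\vec{p}_2').
\end{aligned} \tag{III.60} $$

Since the total energy $\mathcal{H}_1^{(a)} + \mathcal{H}_1^{(b)}$ is conserved in a collision, the above equation can be
satisfied for $\beta_a = \beta_b = \beta$. From eq.(III.57) this condition implies the equality of the kinetic
energies of the two species,

$$ \left\langle \frac{p_a^2}{2m_a} \right\rangle = \left\langle \frac{p_b^2}{2m_b} \right\rangle = \frac{3}{2\beta}. \tag{III.61} $$

The parameter $\beta$ thus plays the role of an empirical temperature describing the equilibrium
of gases.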

(3) The equation of state: To complete the identification of $\beta$ with temperature $T$, consider
a gas of $N$ particles confined to a box of volume $V$. The gas pressure results from the force
exerted by the particles colliding with the walls of the container. Consider a wall element
of area $A$ perpendicular to the $x$ direction. The number of particles impacting this area,
with momenta in the interval $[\vec{p}, \vec{p} + d\vec{p}\,]$, over a time period $\delta t$, is

$$ dN(\vec{p}) = \left( f_1(\vec{p})\, d^3\vec{p} \right) \left( A\, v_x\, \delta t \right). \tag{III.62} $$

The final factor in the above expression is the volume of a cylinder of height $v_x \delta t$
perpendicular to the area element $A$. Only particles within this cylinder are close enough to
impact the wall during $\delta t$. As each collision imparts a momentum $2p_x$ to the wall, the net
force exerted is

$$ F = \frac{1}{\delta t} \int_0^{\infty} dp_x \int_{-\infty}^{\infty} dp_y \int_{-\infty}^{\infty} dp_z\; f_1(\vec{p}) \left( A\, \frac{p_x}{m}\, \delta t \right) (2p_x). \tag{III.63} $$

As only particles with velocities directed towards the wall will hit it, the first integral is
over half of the range of $p_x$. Since the integrand is even in $p_x$, this restriction can be
removed by dividing the full integral by 2. The pressure $P$ is then obtained from the force
per unit area as

$$ P = \frac{F}{A} = \int d^3\vec{p}\; f_1(\vec{p})\, \frac{p_x^2}{m} = \frac{1}{m} \int d^3\vec{p}\; p_x^2\, n \left( \frac{\beta}{2\pi m} \right)^{3/2} \exp\left( -\frac{\beta p^2}{2m} \right) = \frac{n}{\beta}, \tag{III.64} $$

where eq.(III.56) is used for the equilibrium form of $f_1$. Comparing with the standard
equation of state, $PV = N k_B T$, for an ideal gas, leads to the identification, $\beta = 1/k_B T$.
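
The wall-collision argument lends itself to a simple Monte Carlo check. In the sketch below, momenta are sampled from the equilibrium Gaussian, only particles moving towards the wall contribute, and each contributes its momentum transfer $2p_x$ weighted by $v_x = p_x/m$. The function name, sample size, random seed, and parameter values are illustrative assumptions.

```python
import math
import random

def kinetic_pressure(n, m, kT, n_samples=200000, seed=2):
    """Monte Carlo estimate of the force per unit area on a wall normal to x.

    Sampled particles with p_x > 0 move towards the wall and transfer 2 p_x
    at a rate proportional to v_x = p_x / m; the others never reach it.
    """
    rng = random.Random(seed)
    sigma = math.sqrt(m * kT)  # standard deviation of each momentum component
    total = 0.0
    for _ in range(n_samples):
        px = rng.gauss(0.0, sigma)
        if px > 0.0:
            total += (px / m) * (2.0 * px)
    return n * total / n_samples

n, m, kT = 3.0, 1.0, 2.0   # arbitrary illustrative values
P = kinetic_pressure(n, m, kT)
print(P, n * kT)           # the estimate fluctuates around n * kT
```

The estimate carries a statistical error of order $1/\sqrt{\text{n\_samples}}$, so agreement with $n k_B T$ is expected only to within a fraction of a percent here.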
(4) Entropy: As discussed earlier, the Boltzmann H-function is closely related to the
information content of the one-particle PDF $\rho_1$. We can also define a corresponding Boltzmann
entropy,

$$ S_B(t) = -k_B H(t), \tag{III.65} $$

where the constant $k_B$ reflects the historical origins of entropy. The H-theorem implies that
$S_B$ can only increase with time in approaching equilibrium. It has the further advantage
of being defined through eq.(III.42) for situations that are clearly out of equilibrium. For
a gas in equilibrium in a box of volume $V$, from eq.(III.56), we compute

$$ \begin{aligned}
H &= V \int d^3\vec{p}\; f_1(\vec{p}) \ln f_1(\vec{p}) \\
&= V \int d^3\vec{p}\; \frac{N}{V} \left( 2\pi m k_B T \right)^{-3/2} \exp\left( -\frac{p^2}{2 m k_B T} \right) \left[ \ln\left( \frac{n}{(2\pi m k_B T)^{3/2}} \right) - \frac{p^2}{2 m k_B T} \right] \\
&= N \left[ \ln\left( \frac{n}{(2\pi m k_B T)^{3/2}} \right) - \frac{3}{2} \right].
\end{aligned} \tag{III.66} $$

The entropy is now identified as

$$ S_B = -k_B H = N k_B \left[ \frac{3}{2} + \frac{3}{2} \ln\left( 2\pi m k_B T \right) - \ln\left( \frac{N}{V} \right) \right]. \tag{III.67} $$




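The thermodynamic relation, $T dS_B = dE + P dV$, implies

$$ \begin{aligned}
\left. \frac{\partial E}{\partial T} \right|_V &= T \left. \frac{\partial S_B}{\partial T} \right|_V = \frac{3}{2} N k_B, \\
P + \left. \frac{\partial E}{\partial V} \right|_T &= T \left. \frac{\partial S_B}{\partial V} \right|_T = \frac{N k_B T}{V}.
\end{aligned} \tag{III.68} $$

The usual properties of a monatomic ideal gas, $PV = N k_B T$, and $E = 3 N k_B T/2$, can
now be obtained from the above equations. Also note that for this classical gas, the zero
temperature limit of the entropy in eq.(III.67) is not independent of the density $n$, in
violation of the third law of thermodynamics.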
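
The two derivatives in eq.(III.68) can be verified by finite differences applied to the entropy of eq.(III.67). This is a minimal sketch in units where $k_B = 1$; the helper names, the step size, and the values of $N$, $V$, $T$ are assumptions chosen for illustration.

```python
import math

K_B = 1.0  # units with k_B = 1 (an assumption made for this illustration)

def boltzmann_entropy(N, V, T, m=1.0):
    """S_B of eq.(III.67) for N particles of mass m in volume V at temperature T."""
    return N * K_B * (1.5 + 1.5 * math.log(2.0 * math.pi * m * K_B * T) - math.log(N / V))

def central_difference(f, x, h):
    """Symmetric finite-difference estimate of df/dx."""
    return (f(x + h) - f(x - h)) / (2.0 * h)

N, V, T = 1000.0, 50.0, 3.0
T_dS_dT = T * central_difference(lambda t: boltzmann_entropy(N, V, t), T, 1e-4)
T_dS_dV = T * central_difference(lambda v: boltzmann_entropy(N, v, T), V, 1e-4)
print(T_dS_dT, 1.5 * N * K_B)    # heat capacity at constant volume, 3 N k_B / 2
print(T_dS_dV, N * K_B * T / V)  # equals P + dE/dV, i.e. N k_B T / V
```

The same function also shows the third-law problem noted above: as $T \to 0$ the logarithm diverges, and the value still depends on $N/V$.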




III.G Conservation Laws


• Approach to equilibrium: We now address the third question posed in the introduction,
of how the gas reaches its final equilibrium. Consider a situation in which the gas is
perturbed from the equilibrium form described by eq.(III.56), and follow its relaxation to
equilibrium. There is a hierarchy of mechanisms that operate at different time scales.

(i) The fastest processes are the two body collisions of particles in immediate vicinity.
Over a time scale of the order of $\tau_c$, $f_2(\vec{q}_1, \vec{q}_2, t)$ relaxes to $f_1(\vec{q}_1, t) f_1(\vec{q}_2, t)$ for
separations $|\vec{q}_1 - \vec{q}_2| \gg d$. Similar relaxations occur for the higher order densities $f_s$.

(ii) At the next stage, $f_1$ relaxes to a local equilibrium form, as in eq.(III.53), over the
time scale of the mean free time $\tau_\times$. This is the intrinsic scale set by the collision term
on the right hand side of the Boltzmann equation. After this time interval, quantities
conserved in collisions achieve a state of local equilibrium. We can then define at each
point a (time dependent) local density by integrating over all momenta as

$$ n(\vec{q}, t) = \int d^3\vec{p}\; f_1(\vec{p}, \vec{q}, t), \tag{III.69} $$

as well as a local expectation value for any operator $O(\vec{p}, \vec{q}, t)$

$$ \langle O(\vec{q}, t) \rangle = \frac{1}{n(\vec{q}, t)} \int d^3\vec{p}\; f_1(\vec{p}, \vec{q}, t)\, O(\vec{p}, \vec{q}, t). \tag{III.70} $$

(iii) After the densities and expectation values have relaxed to their local equilibrium forms
in the intrinsic time scales $\tau_c$ and $\tau_\times$, there is a subsequent slower relaxation to global
equilibrium over extrinsic time and length scales. This final stage is governed by the
smaller streaming terms on the left hand side of the Boltzmann equation. It is most
conveniently expressed in terms of the time evolution of conserved quantities according
to hydrodynamic equations.


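Conserved quantities are left unchanged by the two body collisions, i.e. satisfy

$$ \chi(\vec{p}_1, \vec{q}, t) + \chi(\vec{p}_2, \vec{q}, t) = \chi(\vec{p}_1', \vec{q}, t) + \chi(\vec{p}_2', \vec{q}, t), \tag{III.71} $$

where $(\vec{p}_1, \vec{p}_2)$ and $(\vec{p}_1', \vec{p}_2')$ refer to the momenta before and after a collision, respectively.
For such quantities, we have

$$ J_\chi(\vec{q}, t) = \int d^3\vec{p}\; \chi(\vec{p}, \vec{q}, t) \left. \frac{df_1}{dt} \right|_{\rm coll.} (\vec{p}, \vec{q}, t) = 0. \tag{III.72} $$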




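• Proof: Using the form of the collision integral, we have

$$ J_\chi = -\int d^3\vec{p}_1\, d^3\vec{p}_2\, d^2\vec{b}\; |\vec{v}_1 - \vec{v}_2| \left[ f_1(\vec{p}_1) f_1(\vec{p}_2) - f_1(\vec{p}_1') f_1(\vec{p}_2') \right] \chi(\vec{p}_1). \tag{III.73} $$

(The implicit arguments $(\vec{q}, t)$ are left out for ease of notation.) We now perform the same
set of changes of variables that were used in the proof of the H-theorem. The first step is
averaging after exchange of the dummy variables $\vec{p}_1$ and $\vec{p}_2$, leading to

$$ J_\chi = -\frac{1}{2} \int d^3\vec{p}_1\, d^3\vec{p}_2\, d^2\vec{b}\; |\vec{v}_1 - \vec{v}_2| \left[ f_1(\vec{p}_1) f_1(\vec{p}_2) - f_1(\vec{p}_1') f_1(\vec{p}_2') \right] \left[ \chi(\vec{p}_1) + \chi(\vec{p}_2) \right]. \tag{III.74} $$

Next, change variables from the originators $(\vec{p}_1, \vec{p}_2, \vec{b})$, to the products $(\vec{p}_1', \vec{p}_2', \vec{b}\,')$ of the
collision. After relabeling the integration variables, the above equation is transformed to

$$ J_\chi = -\frac{1}{2} \int d^3\vec{p}_1\, d^3\vec{p}_2\, d^2\vec{b}\; |\vec{v}_1 - \vec{v}_2| \left[ f_1(\vec{p}_1') f_1(\vec{p}_2') - f_1(\vec{p}_1) f_1(\vec{p}_2) \right] \left[ \chi(\vec{p}_1') + \chi(\vec{p}_2') \right]. \tag{III.75} $$

Averaging the last two equations leads to

$$ J_\chi = -\frac{1}{4} \int d^3\vec{p}_1\, d^3\vec{p}_2\, d^2\vec{b}\; |\vec{v}_1 - \vec{v}_2| \left[ f_1(\vec{p}_1) f_1(\vec{p}_2) - f_1(\vec{p}_1') f_1(\vec{p}_2') \right] \left[ \chi(\vec{p}_1) + \chi(\vec{p}_2) - \chi(\vec{p}_1') - \chi(\vec{p}_2') \right], \tag{III.76} $$

which is zero from eq.(III.71).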
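
A toy simulation makes the role of the conserved quantities concrete. The sketch below repeatedly collides randomly chosen pairs of equal-mass particles using the kinematics of eq.(III.39), and monitors the sums of $\chi = 1, p_x, p_y, p_z, p^2$. It is only an illustration under strong simplifying assumptions: pairs are picked uniformly (the flux factor $|\vec{v}_1 - \vec{v}_2|$ is ignored), the outgoing direction is uniform on the sphere, positions are not tracked, and all names and parameter values are hypothetical.

```python
import math
import random

def collide(p1, p2, omega):
    """Equal-mass elastic collision, eq.(III.39); omega is the outgoing unit vector."""
    rel = math.sqrt(sum((a - b) ** 2 for a, b in zip(p1, p2)))
    out1 = [(a + b + rel * w) / 2.0 for a, b, w in zip(p1, p2, omega)]
    out2 = [(a + b - rel * w) / 2.0 for a, b, w in zip(p1, p2, omega)]
    return out1, out2

def random_direction(rng):
    """Direction drawn uniformly on the unit sphere (an assumed scattering law)."""
    z = rng.uniform(-1.0, 1.0)
    phi = rng.uniform(0.0, 2.0 * math.pi)
    s = math.sqrt(1.0 - z * z)
    return (s * math.cos(phi), s * math.sin(phi), z)

def totals(momenta):
    """Sums of chi = 1, p_x, p_y, p_z, p^2 over all particles."""
    sums = [float(len(momenta)), 0.0, 0.0, 0.0, 0.0]
    for p in momenta:
        for axis in range(3):
            sums[1 + axis] += p[axis]
            sums[4] += p[axis] ** 2
    return sums

rng = random.Random(3)
n_particles = 2000
# Far-from-equilibrium start: two opposing beams along the x axis.
momenta = [[1.0 if i % 2 == 0 else -1.0, 0.0, 0.0] for i in range(n_particles)]
before = totals(momenta)

for _ in range(20 * n_particles):
    i, j = rng.sample(range(n_particles), 2)
    momenta[i], momenta[j] = collide(momenta[i], momenta[j], random_direction(rng))

after = totals(momenta)
drift = max(abs(a - b) for a, b in zip(before, after))
mean_sq = [sum(p[axis] ** 2 for p in momenta) / n_particles for axis in range(3)]
print(drift)    # the five conserved sums are unchanged up to rounding
print(mean_sq)  # each component approaches 1/3 as the momenta isotropize
```

The five sums stay fixed while everything else about the distribution relaxes, which is why only these quantities survive to govern the slow hydrodynamic stage.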

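Let us explore the consequences of this result for the evolution of expectation values
involving $\chi$. Substituting for the collision term in eq.(III.72) the streaming terms on the
left hand side of the Boltzmann equation, leads to

$$ J_\chi(\vec{q}, t) = \int d^3\vec{p}\; \chi(\vec{p}, \vec{q}, t) \left[ \partial_t + \frac{p_\alpha}{m} \partial_\alpha + F_\alpha \frac{\partial}{\partial p_\alpha} \right] f_1(\vec{p}, \vec{q}, t) = 0, \tag{III.77} $$

where we have introduced the notations $\partial_t \equiv \partial/\partial t$, $\partial_\alpha \equiv \partial/\partial q_\alpha$, and $F_\alpha = -\partial U/\partial q_\alpha$. We
can manipulate the above equation into the form

$$ \int d^3\vec{p} \left\{ \left[ \partial_t + \frac{p_\alpha}{m} \partial_\alpha + F_\alpha \frac{\partial}{\partial p_\alpha} \right] (\chi f_1) - f_1 \left[ \partial_t + \frac{p_\alpha}{m} \partial_\alpha + F_\alpha \frac{\partial}{\partial p_\alpha} \right] \chi \right\} = 0. \tag{III.78} $$

The third term is zero, as it is a complete derivative. Using the definition of expectation
values in eq.(III.70), the remaining terms can be rearranged into

$$ \partial_t \left( n \langle \chi \rangle \right) + \partial_\alpha \left( n \left\langle \frac{p_\alpha}{m} \chi \right\rangle \right) - n \langle \partial_t \chi \rangle - n \left\langle \frac{p_\alpha}{m} \partial_\alpha \chi \right\rangle - n F_\alpha \left\langle \frac{\partial \chi}{\partial p_\alpha} \right\rangle = 0. \tag{III.79} $$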




As discussed earlier, for elastic collisions, there are 5 conserved quantities: particle
number, the three components of momentum, and kinetic energy. Each leads to a
corresponding hydrodynamic equation, as constructed below:

(a) Particle number: Setting $\chi = 1$ in eq.(III.79) leads to

$$ \partial_t n + \partial_\alpha \left( n u_\alpha \right) = 0, \tag{III.80} $$

where we have introduced the local velocity

$$ \vec{u} \equiv \left\langle \frac{\vec{p}}{m} \right\rangle. \tag{III.81} $$

This equation simply states that the time variation of the local particle density is due to
a particle current $\vec{J}_n = n\vec{u}$.
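(b) Momentum: Any linear function of the momentum $\vec{p}$ is conserved in the collision, and
we shall explore the consequences of the conservation of

$$ \vec{c} \equiv \frac{\vec{p}}{m} - \vec{u}. \tag{III.82} $$

Substituting $c_\alpha$ into eq.(III.79) leads to

$$ \partial_\beta \left( n \left\langle (u_\beta + c_\beta) c_\alpha \right\rangle \right) + n \partial_t u_\alpha + n \partial_\beta u_\alpha \left\langle u_\beta + c_\beta \right\rangle - n \frac{F_\alpha}{m} = 0. \tag{III.83} $$

Taking advantage of $\langle c_\alpha \rangle = 0$, from eqs.(III.81) and (III.82), leads to

$$ \partial_t u_\alpha + u_\beta \partial_\beta u_\alpha = \frac{F_\alpha}{m} - \frac{1}{mn} \partial_\beta P_{\alpha\beta}, \tag{III.84} $$

where we have introduced the pressure tensor,

$$ P_{\alpha\beta} \equiv m n \left\langle c_\alpha c_\beta \right\rangle. \tag{III.85} $$

The left hand side of the equation is the acceleration of an element of the fluid $d\vec{u}/dt$,
which should equal $\vec{F}_{\rm net}/m$ according to Newton’s equation. The net force includes an
additional component due to the variations in the pressure tensor across the fluid.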


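(c) Kinetic energy: We first introduce an average local kinetic energy

$$ \varepsilon \equiv \left\langle \frac{m c^2}{2} \right\rangle = \left\langle \frac{p^2}{2m} - \vec{p}\cdot\vec{u} + \frac{m u^2}{2} \right\rangle, \tag{III.86} $$

and then examine the conservation law obtained by setting $\chi$ equal to $mc^2/2$ in eq.(III.79).
Since for space and time derivatives $\partial (mc^2/2) = m c_\beta \partial c_\beta = -m c_\beta \partial u_\beta$, we obtain

$$ \partial_t (n\varepsilon) + \partial_\alpha \left( n \left\langle (u_\alpha + c_\alpha) \frac{m c^2}{2} \right\rangle \right) + n m\, \partial_t u_\beta \left\langle c_\beta \right\rangle + n m\, \partial_\alpha u_\beta \left\langle (u_\alpha + c_\alpha) c_\beta \right\rangle - n F_\alpha \left\langle c_\alpha \right\rangle = 0. \tag{III.87} $$

Taking advantage of $\langle c_\alpha \rangle = 0$, the above equation is simplified to

$$ \partial_t (n\varepsilon) + \partial_\alpha (n u_\alpha \varepsilon) + \partial_\alpha \left( n \left\langle c_\alpha \frac{m c^2}{2} \right\rangle \right) + P_{\alpha\beta}\, \partial_\alpha u_\beta = 0. \tag{III.88} $$

We next take out the dependence on $n$ in the first two terms of the above equation, finding

$$ \varepsilon\, \partial_t n + n\, \partial_t \varepsilon + \varepsilon\, \partial_\alpha (n u_\alpha) + n u_\alpha \partial_\alpha \varepsilon + \partial_\alpha h_\alpha + P_{\alpha\beta} u_{\alpha\beta} = 0, \tag{III.89} $$

where we have also introduced the local heat flux

$$ \vec{h} \equiv \frac{n m}{2} \left\langle c_\alpha c^2 \right\rangle, \tag{III.90} $$

and the rate of strain tensor

$$ u_{\alpha\beta} = \frac{1}{2} \left( \partial_\alpha u_\beta + \partial_\beta u_\alpha \right). \tag{III.91} $$

Eliminating the first and third terms in eq.(III.89) with the aid of eq.(III.80) leads to

$$ \partial_t \varepsilon + u_\alpha \partial_\alpha \varepsilon = -\frac{1}{n} \partial_\alpha h_\alpha - \frac{1}{n} P_{\alpha\beta} u_{\alpha\beta}. \tag{III.92} $$

Clearly to solve the hydrodynamic equations for $n$, $\vec{u}$, and $\varepsilon$, we need expressions for $P_{\alpha\beta}$
and $\vec{h}$, which are either given phenomenologically, or calculated from the density $f_1$, as in
the next sections.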




III.H Zeroth Order Hydrodynamics


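As a first approximation, we shall assume that in local equilibrium, the density $f_1$ at
each point in space can be represented as in eq.(III.56), i.e.

$$ f_1^0(\vec{p}, \vec{q}, t) = \frac{n(\vec{q}, t)}{\left( 2\pi m k_B T(\vec{q}, t) \right)^{3/2}} \exp\left[ -\frac{\left( \vec{p} - m\vec{u}(\vec{q}, t) \right)^2}{2 m k_B T(\vec{q}, t)} \right]. \tag{III.93} $$

The choice of parameters clearly enforces $\int d^3\vec{p}\, f_1^0 = n$, and $\langle \vec{p}/m \rangle^0 = \vec{u}$, as required.
Average values are easily calculated from this Gaussian weight; in particular

$$ \left\langle c_\alpha c_\beta \right\rangle^0 = \frac{k_B T}{m} \delta_{\alpha\beta}, \tag{III.94} $$

leading to

$$ P^0_{\alpha\beta} = n k_B T\, \delta_{\alpha\beta}, \quad \text{and} \quad \varepsilon = \frac{3}{2} k_B T. \tag{III.95} $$

Since the density $f_1^0$ is even in $\vec{c}$, all odd expectation values vanish, and in particular

$$ \vec{h}^0 = 0. \tag{III.96} $$

The conservation laws in this approximation take the simple forms

$$ \begin{cases}
D_t n = -n\, \partial_\alpha u_\alpha, \\
m D_t u_\alpha = F_\alpha - \dfrac{1}{n} \partial_\alpha \left( n k_B T \right), \\
D_t T = -\dfrac{2}{3} T\, \partial_\alpha u_\alpha.
\end{cases} \tag{III.97} $$

In the above expression, we have introduced the material derivative

$$ D_t \equiv \left[ \partial_t + u_\beta \partial_\beta \right], \tag{III.98} $$

which measures the time variations of any quantity as it moves along the stream-lines set
up by the average velocity field $\vec{u}$. By combining the first and third equations, it is easy
to get

$$ D_t \ln\left( n T^{-3/2} \right) = 0. \tag{III.99} $$

The quantity $\ln\left( n T^{-3/2} \right)$ is like a local entropy for the gas (see eq.(III.67)), which according
to the above equation is not changed along stream-lines. The zeroth order hydrodynamics
thus predicts that the gas flow is adiabatic. This prevents the local equilibrium solution
of eq.(III.93) from reaching a true global equilibrium form which necessitates an increase
in entropy.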

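To demonstrate that eqs.(III.97) do not describe a satisfactory approach to equilibrium,
examine the evolution of small deformations about a stationary ($\vec{u}_0 = 0$) state, in a
uniform box ($\vec{F} = 0$), by setting

$$ \begin{cases}
n(\vec{q}, t) = \bar{n} + \nu(\vec{q}, t), \\
T(\vec{q}, t) = \bar{T} + \theta(\vec{q}, t).
\end{cases} \tag{III.100} $$

We shall next expand eqs.(III.97) to first order in the deviations $(\nu, \theta, \vec{u})$. Note that to
lowest order, $D_t = \partial_t + O(u)$, leading to the linearized zeroth order hydrodynamic equations

$$ \begin{cases}
\partial_t \nu = -\bar{n}\, \partial_\alpha u_\alpha, \\
m\, \partial_t u_\alpha = -\dfrac{k_B \bar{T}}{\bar{n}} \partial_\alpha \nu - k_B\, \partial_\alpha \theta, \\
\partial_t \theta = -\dfrac{2}{3} \bar{T}\, \partial_\alpha u_\alpha.
\end{cases} \tag{III.101} $$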


e Normal modes of the system are obtained by Fourier transformations, 
A G w) = [eae exp i (k- q— wt) | A(qg,t), (III.102) 


where A stands for any of the three fields (v,0,v). The natural vibration frequencies are 


solutions to the matrix equation 


V 0 nike 0 V 
Ww Ua = fel § akg ay FB § akg UB F (III.103) 
0 0 2T kg 0 0 


It is easy to check that this equation has the following modes, the first three with zero 

frequency: 

(a) Two modes describe shear flows in a uniform (n = 7) and isothermal (T' = T) fluid, in 
which the velocity varies along a direction normal to its orientation (e.g. d= f(x, t)y). 
In terms of Fourier modes k tip (k) = 0, indicating transverse flows that are not relaxed 
in this zeroth order approximation. 

(b) A third zero frequency mode describes a stationary fluid with uniform pressure P = 


nkpT. While n and T may vary across space, their product is constant, insuring 


69 


that the fluid will not start moving due to pressure variations. The corresponding 


eigenvector of eq.(III.103) is 
nN 
ve=|[ 0 |. (111.104) 
—T 
(c) Finally, the longitudinal velocity (i, || &) combines with density and temperature 


variations in eigenmodes of the form 


| 
ae ae with w(k) = +u,|kl, (IIT.105) 


[5 kpT 
ane peeetiaceaeees III.106 
@ 3m’ ( ) 


is the longitudinal sound velocity. Note that the density and temperature variations 


vi= { w( 


where 


in this mode are adiabatic, i.e. the local entropy (proportional to In (nT~3/2)) is left 

unchanged. 

We thus find that none of the conserved quantities relaxes to equilibrium in the zeroth 
order approximation. Shear flow and entropy modes persist forever, while the two sound 
modes have undamped oscillations. This is a deficiency of the zeroth order approximation 


which is removed by finding a better solution to the Boltzmann equation. 
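The mode structure listed above can be checked numerically. The following is a minimal sketch, assuming illustrative units ($k_B = m = 1$) and arbitrary values for $\bar{n}$, $\bar{T}$, and $k$ that are not taken from the text; it restricts eq.(III.103) to the longitudinal components $(\tilde\nu, \tilde u_\ell, \tilde\theta)$ and confirms that the entropy mode and the two sound modes are eigenvectors with the stated frequencies.

```python
import math

# Illustrative units and values (assumptions, not from the text).
kB, m = 1.0, 1.0
n_bar, T_bar = 2.0, 3.0
k = 0.7  # magnitude of the wavevector, taken along the longitudinal direction

# Longitudinal block of the zeroth order matrix, acting on (nu, u_l, theta).
M = [
    [0.0, n_bar * k, 0.0],
    [kB * T_bar * k / (m * n_bar), 0.0, kB * k / m],
    [0.0, 2.0 * T_bar * k / 3.0, 0.0],
]

def matvec(A, v):
    """Matrix-vector product for small dense lists."""
    return [sum(a * x for a, x in zip(row, v)) for row in A]

def residual(omega, v):
    """Largest component of M v - omega v (zero for an exact eigenpair)."""
    return max(abs(mv - omega * x) for mv, x in zip(matvec(M, v), v))

v_sound = math.sqrt(5.0 * kB * T_bar / (3.0 * m))  # eq.(III.106)

# Entropy mode of eq.(III.104), with zero frequency.
res_entropy = residual(0.0, [n_bar, 0.0, -T_bar])

# Sound modes of eq.(III.105), with omega = +/- v_sound * k.
res_sound = []
for sign in (+1, -1):
    omega = sign * v_sound * k
    res_sound.append(residual(omega, [n_bar * k, omega, 2.0 * T_bar * k / 3.0]))

# Adiabaticity of the sound mode: d ln(n T^(-3/2)) = nu/n - (3/2) theta/T.
entropy_change = (n_bar * k) / n_bar - 1.5 * (2.0 * T_bar * k / 3.0) / T_bar

print(res_entropy, res_sound, entropy_change)
```

All three residuals vanish to rounding error, and the vanishing of `entropy_change` is the statement that the sound modes leave $\ln(nT^{-3/2})$ unchanged.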


III... First Order Hydrodynamics 


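While $f_1^0(\vec{p},\vec{q},t)$ of eq.(III.93) does set the right hand side of the Boltzmann equation to zero, it is not a full solution, as the left hand side causes its form to vary. The left hand side is a linear differential operator, which using the various notations introduced in the previous sections, can be written as

$$ \mathcal{L}[f] \equiv \left[ \partial_t + \frac{p_\alpha}{m}\partial_\alpha + F_\alpha \frac{\partial}{\partial p_\alpha} \right] f = \left[ D_t + c_\alpha \partial_\alpha + \frac{F_\alpha}{m}\frac{\partial}{\partial c_\alpha} \right] f. \tag{III.107} $$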


It is simpler to examine the effect of $\mathcal{L}$ on $\ln f_1^0$, which can be written as

$$ \ln f_1^0 = \ln\left(nT^{-3/2}\right) - \frac{mc^2}{2k_BT} - \frac{3}{2}\ln\left(2\pi m k_B\right). \tag{III.108} $$


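Using the relation $\partial(c^2/2) = c_\beta\,\partial c_\beta = -c_\beta\,\partial u_\beta$, we get

$$ \begin{aligned} \mathcal{L}\left[\ln f_1^0\right] ={}& D_t \ln\left(nT^{-3/2}\right) + \frac{mc^2}{2k_BT^2}\,D_tT + \frac{m}{k_BT}\,c_\alpha D_t u_\alpha \\ &+ c_\alpha\left(\frac{\partial_\alpha n}{n} - \frac{3}{2}\frac{\partial_\alpha T}{T}\right) + \frac{mc^2}{2k_BT^2}\,c_\alpha\partial_\alpha T + \frac{m}{k_BT}\,c_\alpha c_\beta\,\partial_\alpha u_\beta - \frac{F_\alpha c_\alpha}{k_BT}. \end{aligned} \tag{III.109} $$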


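If the fields $n$, $T$, and $u_\alpha$ satisfy the zeroth order hydrodynamic eqs.(III.97), we can simplify the above equation to

$$ \begin{aligned} \mathcal{L}\left[\ln f_1^0\right] ={}& 0 - \frac{mc^2}{3k_BT}\,\partial_\alpha u_\alpha + c_\alpha\left[\left(\frac{F_\alpha}{k_BT} - \frac{\partial_\alpha n}{n} - \frac{\partial_\alpha T}{T}\right) + \left(\frac{\partial_\alpha n}{n} - \frac{3}{2}\frac{\partial_\alpha T}{T}\right) - \frac{F_\alpha}{k_BT}\right] \\ &+ \frac{mc^2}{2k_BT^2}\,c_\alpha\partial_\alpha T + \frac{m}{k_BT}\,c_\alpha c_\beta\,u_{\alpha\beta} \\ ={}& \frac{m}{k_BT}\left(c_\alpha c_\beta - \frac{\delta_{\alpha\beta}}{3}c^2\right)u_{\alpha\beta} + \left(\frac{mc^2}{2k_BT} - \frac{5}{2}\right)\frac{c_\alpha}{T}\,\partial_\alpha T. \end{aligned} \tag{III.110} $$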

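The characteristic time scale $\tau_U$ for $\mathcal{L}$ is extrinsic, and can be made much larger than $\tau_\times$. The zeroth order result is thus exact in the limit $(\tau_\times/\tau_U) \to 0$; and corrections can be constructed in a perturbation series in $(\tau_\times/\tau_U)$. To this purpose, we set $f_1 = f_1^0(1+g)$, and linearize the collision operator as

$$ \begin{aligned} C[f_1,f_1] &= -\int d^3\vec{p}_2\, d^2\vec{b}\; |\vec{v}_1 - \vec{v}_2|\; f_1^0(\vec{p}_1)\, f_1^0(\vec{p}_2) \left[ g(\vec{p}_1) + g(\vec{p}_2) - g(\vec{p}_1{}') - g(\vec{p}_2{}') \right] \\ &\equiv -f_1^0(\vec{p}_1)\, C_L[g]. \end{aligned} \tag{III.111} $$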
While linear, the above integral operator is still difficult to manipulate in general. As a first approximation, and noting its characteristic magnitude, we set

$$ C_L[g] \approx \frac{g}{\tau_\times}. \tag{III.112} $$

This is known as the single collision time approximation, and from the linearized Boltzmann equation $\mathcal{L}[f_1] = -f_1^0\, C_L[g]$, we obtain

$$ g = -\tau_\times \frac{1}{f_1^0}\,\mathcal{L}[f_1] \approx -\tau_\times\, \mathcal{L}\left[\ln f_1^0\right], \tag{III.113} $$


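where we have kept only the leading term. Thus the first order solution is given by (using eq.(III.110))

$$ f_1^1(\vec{p},\vec{q},t) = f_1^0(\vec{p},\vec{q},t) \left[ 1 - \frac{\tau_\mu m}{k_BT}\left(c_\alpha c_\beta - \frac{\delta_{\alpha\beta}}{3}c^2\right)u_{\alpha\beta} - \tau_K\left(\frac{mc^2}{2k_BT} - \frac{5}{2}\right)\frac{c_\alpha}{T}\,\partial_\alpha T \right], \tag{III.114} $$

where $\tau_\mu = \tau_K = \tau_\times$ in the single collision time approximation. However, in writing the above equation, we have anticipated the possibility of $\tau_\mu \neq \tau_K$ which arises in more sophisticated treatments (although both times are still of order of $\tau_\times$).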

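It is easy to check that $\int d^3\vec{p}\, f_1^1 = \int d^3\vec{p}\, f_1^0 = n$, and thus various local expectation values are calculated to first order as

$$ \langle \mathcal{O} \rangle^1 = \frac{1}{n}\int d^3\vec{p}\; \mathcal{O}\, f_1^0\,(1+g) = \langle \mathcal{O} \rangle^0 + \langle g\,\mathcal{O} \rangle^0. \tag{III.115} $$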


The calculation of averages over products of $c_\alpha$’s, distributed according to the Gaussian weight of $f_1^0$, is greatly simplified by the use of Wick’s theorem, which states that the expectation value of the product is the sum over all possible products of paired expectation values, for example

$$ \langle c_\alpha c_\beta c_\gamma c_\delta \rangle^0 = \left(\frac{k_BT}{m}\right)^2 \left( \delta_{\alpha\beta}\delta_{\gamma\delta} + \delta_{\alpha\gamma}\delta_{\beta\delta} + \delta_{\alpha\delta}\delta_{\beta\gamma} \right). \tag{III.116} $$

(Expectation values involving a product of an odd number of $c_\alpha$’s are zero by symmetry.)
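The pairing rule of Wick’s theorem is easy to mechanize. The sketch below is only an illustration: the helper names `pairings` and `wick_moment` are hypothetical, and `sigma2` stands for the single-component variance $\langle c_\alpha c_\alpha\rangle^0 = k_BT/m$ (set to 1 by default). It enumerates all complete pairings of a list of Cartesian indices and sums the products of Kronecker deltas.

```python
def pairings(items):
    """Yield every way of splitting `items` into unordered pairs."""
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for i, partner in enumerate(rest):
        remaining = rest[:i] + rest[i + 1:]
        for tail in pairings(remaining):
            yield [(first, partner)] + tail

def wick_moment(indices, sigma2=1.0):
    """<c_a c_b ...>^0 for isotropic Gaussian components with
    <c_a c_b>^0 = sigma2 * delta_ab (sigma2 plays the role of k_B T / m)."""
    if len(indices) % 2:
        return 0.0  # odd moments vanish by symmetry
    total = 0.0
    for pairing in pairings(list(indices)):
        # A pairing contributes only if every pair carries equal indices.
        if all(a == b for a, b in pairing):
            total += sigma2 ** len(pairing)
    return total

# <c_x^4> = 3 sigma^4, <c_x^2 c_y^2> = sigma^4, as in eq.(III.116).
print(wick_moment("xxxx"), wick_moment("xxyy"), wick_moment("xxx"))

# <c^2 c_x^2> = sum over gamma of <c_gamma c_gamma c_x c_x>
c2_cx2 = sum(wick_moment(g + g + "xx") for g in "xyz")
print(c2_cx2)
```

The contraction `c2_cx2` evaluates to $5\,(k_BT/m)^2$, which is the kind of trace over pairs of indices that enters the first order averages of this section.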


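Using this result, it is easy to verify that

$$ \left\langle \frac{p_\alpha}{m} \right\rangle^1 = u_\alpha - \tau_K\,\frac{\partial_\beta T}{T}\left\langle \left(\frac{mc^2}{2k_BT} - \frac{5}{2}\right)c_\alpha c_\beta \right\rangle^0 = u_\alpha. \tag{III.117} $$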


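The pressure tensor at first order is given by

$$ \begin{aligned} P^1_{\alpha\beta} = nm\,\langle c_\alpha c_\beta\rangle^1 &= nm\left[ \langle c_\alpha c_\beta\rangle^0 - \frac{\tau_\mu m}{k_BT}\left\langle c_\alpha c_\beta\left(c_\mu c_\nu - \frac{\delta_{\mu\nu}}{3}c^2\right)\right\rangle^0 u_{\mu\nu} \right] \\ &= nk_BT\,\delta_{\alpha\beta} - 2nk_BT\,\tau_\mu\left(u_{\alpha\beta} - \frac{\delta_{\alpha\beta}}{3}\,u_{\gamma\gamma}\right). \end{aligned} \tag{III.118} $$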


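(Using the above result, we can further verify that $\varepsilon^1 = \langle mc^2/2\rangle^1 = 3k_BT/2$, as before.) Finally, the heat flux is given by

$$ \begin{aligned} h^1_\alpha = n\left\langle c_\alpha \frac{mc^2}{2}\right\rangle^1 &= -\frac{nm\,\tau_K}{2}\,\frac{\partial_\beta T}{T}\left\langle \left(\frac{mc^2}{2k_BT} - \frac{5}{2}\right)c_\alpha c_\beta\, c^2 \right\rangle^0 \\ &= -\frac{5}{2}\,\frac{nk_B^2T\,\tau_K}{m}\,\partial_\alpha T. \end{aligned} \tag{III.119} $$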


At this order, we find that spatial variations in temperature generate a heat flow that tends to smooth them out, while shear flows are opposed by the off-diagonal terms in the pressure tensor. These effects are sufficient to cause relaxation to equilibrium, as can be seen by examining the modified behavior of the modes discussed previously.


(a) The pressure tensor now has an off-diagonal term

$$ P^1_{\alpha\neq\beta} = -2nk_BT\,\tau_\mu\, u_{\alpha\beta} \equiv -\mu\left(\partial_\alpha u_\beta + \partial_\beta u_\alpha\right), \tag{III.120} $$

where $\mu \equiv nk_BT\,\tau_\mu$ is the viscosity coefficient. A shearing of the fluid (e.g. described by a velocity $u_y(x,t)$) now leads to a viscous force that opposes it (proportional to $\mu\,\partial_x^2 u_y$), causing its diffusive relaxation as discussed below.


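(b) Similarly, a temperature gradient leads to a heat flux

$$ \vec{h} = -K\,\nabla T, \tag{III.121} $$

where $K = (5nk_B^2T\,\tau_K)/(2m)$ is the coefficient of thermal conductivity of the gas. If the gas is at rest ($\vec{u} = 0$, and uniform $P = nk_BT$), variations in temperature now satisfy

$$ n\,\partial_t\varepsilon = \frac{3}{2}\,nk_B\,\partial_tT = -\partial_\alpha\left(-K\,\partial_\alpha T\right), \quad\Rightarrow\quad \partial_tT = \frac{2K}{3nk_B}\,\nabla^2T. \tag{III.122} $$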


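This is the Fourier equation and shows that temperature variations relax by diffusion.

We can discuss the behavior of all the modes by linearizing the equations of motion. The first order contribution to $D_tu_\alpha \approx \partial_tu_\alpha$ is

$$ \delta^1\left(\partial_tu_\alpha\right) \equiv -\frac{1}{m\bar{n}}\,\partial_\beta\,\delta^1P_{\alpha\beta} \approx \frac{\mu}{m\bar{n}}\left(\frac{1}{3}\,\partial_\alpha\partial_\beta + \delta_{\alpha\beta}\,\partial_\gamma\partial_\gamma\right)u_\beta, \tag{III.123} $$

where $\mu \equiv \bar{n}k_B\bar{T}\,\tau_\mu$. Similarly, the correction for $D_tT \approx \partial_t\theta$ is given by

$$ \delta^1\left(\partial_t\theta\right) \equiv -\frac{2}{3k_B\bar{n}}\,\partial_\alpha h_\alpha \approx \frac{2K}{3k_B\bar{n}}\,\partial_\alpha\partial_\alpha\theta, \tag{III.124} $$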
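with $K = (5\bar{n}k_B^2\bar{T}\,\tau_K)/(2m)$. After Fourier transformation, the matrix equation (III.103) is modified to

$$ \omega \begin{pmatrix} \tilde{\nu} \\ \tilde{u}_\alpha \\ \tilde{\theta} \end{pmatrix} = \begin{pmatrix} 0 & \bar{n}\,\delta_{\alpha\beta}k_\beta & 0 \\ \frac{k_B\bar{T}}{m\bar{n}}\,\delta_{\alpha\beta}k_\beta & -i\,\frac{\mu}{m\bar{n}}\left(k^2\delta_{\alpha\beta} + \frac{k_\alpha k_\beta}{3}\right) & \frac{k_B}{m}\,\delta_{\alpha\beta}k_\beta \\ 0 & \frac{2}{3}\bar{T}\,\delta_{\alpha\beta}k_\beta & -i\,\frac{2Kk^2}{3k_B\bar{n}} \end{pmatrix} \begin{pmatrix} \tilde{\nu} \\ \tilde{u}_\beta \\ \tilde{\theta} \end{pmatrix}. \tag{III.125} $$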


We can ask how the normal mode frequencies calculated in the zeroth order approximation are modified at this order. It is simple to verify that the transverse (shear) normal modes ($\vec{k}\cdot\tilde{\vec{u}}_T = 0$) now have a frequency

$$ \omega_T = -i\,\frac{\mu}{m\bar{n}}\,k^2. \tag{III.126} $$


The imaginary frequency implies that these modes are damped over a characteristic time $\tau_T(k) \sim 1/|\omega_T| \sim \lambda^2/(\tau_\mu \bar{v}^2)$, where $\lambda$ is the corresponding wavelength, and $\bar{v} \sim \sqrt{k_BT/m}$ is a typical gas particle velocity. We see that the characteristic time scales grow as the square of the wavelength, which is characteristic of diffusive processes.


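In the remaining normal modes the velocity is parallel to $\vec{k}$, and eq.(III.125) reduces to

$$ \omega \begin{pmatrix} \tilde{\nu} \\ \tilde{u}_\ell \\ \tilde{\theta} \end{pmatrix} = \begin{pmatrix} 0 & \bar{n}k & 0 \\ \frac{k_B\bar{T}}{m\bar{n}}\,k & -i\,\frac{4\mu k^2}{3m\bar{n}} & \frac{k_B}{m}\,k \\ 0 & \frac{2}{3}\bar{T}k & -i\,\frac{2Kk^2}{3k_B\bar{n}} \end{pmatrix} \begin{pmatrix} \tilde{\nu} \\ \tilde{u}_\ell \\ \tilde{\theta} \end{pmatrix}. \tag{III.127} $$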


The determinant of the dynamical matrix is the product of the three eigen-frequencies, and to lowest order is given by

$$ \det(M) = i\,\frac{2Kk^2}{3k_B\bar{n}}\cdot\bar{n}k\cdot\frac{k_B\bar{T}k}{m\bar{n}} + O(\tau_\times^2). \tag{III.128} $$

At zeroth order the two sound modes have $\omega^0_\pm(k) = \pm v_\ell k$, and hence the frequency of the isobaric mode is

$$ \omega^1_e(k) \approx \frac{\det(M)}{-v_\ell^2k^2} = -i\,\frac{2Kk^2}{5k_B\bar{n}} + O(\tau_\times^2). \tag{III.129} $$

At first order, the longitudinal sound modes also turn into damped oscillations with frequencies $\omega^1_\pm(k) = \pm v_\ell k - i\gamma$. The simplest way to obtain the decay rates is to note that the trace of the dynamical matrix is equal to the sum of the eigenvalues, and hence

$$ \omega^1_\pm(k) = \pm v_\ell k - ik^2\left(\frac{2\mu}{3m\bar{n}} + \frac{2K}{15k_B\bar{n}}\right) + O(\tau_\times^2). \tag{III.130} $$


The damping of all normal modes guarantees the, albeit slow, approach of the gas to its final uniform and stationary equilibrium state.
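The damped frequencies quoted in eqs.(III.129) and (III.130) can be compared with the exact eigenvalues of the longitudinal matrix in eq.(III.127). The following is a minimal sketch under stated assumptions: units with $k_B = m = \bar{n} = \bar{T} = 1$, a small illustrative collision time `tau`, and a hand-written Newton iteration on the characteristic polynomial in place of a library eigenvalue solver.

```python
import math

# Illustrative units and parameter values (assumptions, not from the text).
kB, m, n_bar, T_bar = 1.0, 1.0, 1.0, 1.0
tau = 1e-3  # tau_mu = tau_K = tau_x in the single collision time approximation
k = 1.0

mu = n_bar * kB * T_bar * tau                        # viscosity coefficient
K = 5.0 * n_bar * kB**2 * T_bar * tau / (2.0 * m)    # thermal conductivity
v_l = math.sqrt(5.0 * kB * T_bar / (3.0 * m))        # sound velocity

# Entries of the matrix [[0, a, 0], [b, c, d], [0, e, f]] of eq.(III.127).
a = n_bar * k
b = kB * T_bar * k / (m * n_bar)
c = -1j * 4.0 * mu * k**2 / (3.0 * m * n_bar)
d = kB * k / m
e = 2.0 * T_bar * k / 3.0
f = -1j * 2.0 * K * k**2 / (3.0 * kB * n_bar)

def char_poly(w):
    """det(M - w I), expanded along the first row."""
    return -w * ((c - w) * (f - w) - d * e) - a * b * (f - w)

def char_poly_prime(w):
    """Derivative of char_poly with respect to w."""
    return -((c - w) * (f - w) - d * e) + w * ((c - w) + (f - w)) + a * b

def refine(w, steps=50):
    """Newton iteration started from an approximate root."""
    for _ in range(steps):
        w = w - char_poly(w) / char_poly_prime(w)
    return w

gamma = k**2 * (2.0 * mu / (3.0 * m * n_bar) + 2.0 * K / (15.0 * kB * n_bar))
predicted = [
    -1j * 2.0 * K * k**2 / (5.0 * kB * n_bar),  # isobaric mode, eq.(III.129)
    v_l * k - 1j * gamma,                       # sound modes, eq.(III.130)
    -v_l * k - 1j * gamma,
]
refined = [refine(w) for w in predicted]

for p, r in zip(predicted, refined):
    print(p, r, abs(p - r))
```

For this choice of `tau` the differences between predicted and refined roots are far smaller than `tau` itself, consistent with corrections that appear only at higher order in the collision time, and all three roots have negative imaginary parts.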




IV. Classical Statistical Mechanics 


IV.A General Definitions 


• Statistical Mechanics is a probabilistic approach to equilibrium macroscopic properties of large numbers of degrees of freedom.

As discussed in chapter I, equilibrium properties of macroscopic bodies are phenomenologically described by the laws of thermodynamics. The macro-state $M$ depends on a relatively small number of thermodynamic coordinates. To provide a more fundamental derivation of these properties, we can examine the dynamics of the many degrees of freedom $N$ comprising a macroscopic body. Description of each micro-state $\mu$ requires an enormous amount of information, and the corresponding time evolution, governed by the Hamiltonian equations (discussed in chapter II), is usually quite complicated. Rather than following the evolution of an individual (pure) micro-state, statistical mechanics examines an ensemble of micro-states corresponding to a given (mixed) macro-state. It aims to provide the probabilities $p_M(\mu)$ for the equilibrium ensemble. Liouville’s theorem justifies the assumption that all accessible micro-states are equally likely in an equilibrium ensemble. As discussed in chapter III, such assignment of probabilities is subjective. In this chapter we shall provide unbiased estimates of $p_M(\mu)$ for a number of different equilibrium ensembles. A central conclusion is that in the thermodynamic limit of large $N$ all these ensembles are in fact equivalent. In contrast to kinetic theory, equilibrium statistical mechanics leaves out the question of how various systems achieve equilibrium.


IV.B The Microcanonical Ensemble 


Our starting point in thermodynamics is a mechanically and adiabatically isolated system. In the absence of heat or work input to the system, the internal energy $E$ and the generalized coordinates $\mathbf{x}$ are fixed, specifying a macro-state $M \equiv (E,\mathbf{x})$. The corresponding set of mixed micro-states form the microcanonical ensemble. In classical statistical mechanics, these microstates are defined by points in phase space, their time evolution governed by a Hamiltonian $\mathcal{H}(\mu)$, as discussed in Chapter II. Since the Hamiltonian equations (II.1) conserve the total energy of a given system, all micro-states are confined to the surface $\mathcal{H}(\mu) = E$ in phase space. Assume that there are no other conserved quantities, so that all points on this surface are mutually accessible. The central postulate of statistical mechanics is that the equilibrium probability distribution is given by

$$ p_{(E,\mathbf{x})}(\mu) = \frac{1}{\Omega(E,\mathbf{x})}\cdot\begin{cases} 1 & \text{for } \mathcal{H}(\mu) = E \\ 0 & \text{otherwise} \end{cases} \tag{IV.1} $$


Some remarks and clarification on the above postulate are in order:

Boltzmann’s assumption of equal a priori equilibrium probabilities refers to the above postulate, which is in fact the unbiased probability estimate in phase space subject to the constraint of constant energy. This assignment is consistent with, but not required by, Liouville’s theorem. Note that the phase space, specifying the micro-states $\mu$, must be composed of canonically conjugate pairs. Under a canonical change of variables, $\mu \to \mu'$, volumes in phase space are left invariant. The Jacobian of such transformations is unity, and the transformed probability, $p(\mu') = p(\mu)\,|\partial\mu/\partial\mu'|$, is again uniform on the surface of constant energy.

The normalization factor $\Omega(E,\mathbf{x})$ is the area of the surface of constant energy $E$ in phase space. To avoid subtleties associated with densities that are non-zero only on a surface, it is sometimes more convenient to define the microcanonical ensemble by requiring $E - \Delta \leq \mathcal{H}(\mu) \leq E + \Delta$, i.e. assigning the energy of the ensemble up to an uncertainty of $\Delta$. In this case, the accessible phase space forms a shell of thickness $2\Delta$ around the surface of energy $E$. The normalization is now the volume of the shell, $\Omega' \approx 2\Delta\,\Omega$. Since $\Omega$ typically depends exponentially on $E$, as long as $\Delta \sim O(E^0)$ (or even $O(E^1)$), the difference between the surface and volume of the shell is negligible in the $E \propto N \to \infty$ limit, and we shall use $\Omega$ and $\Omega'$ interchangeably.


The entropy of this uniform probability distribution is given by

$$ S(E,\mathbf{x}) = k_B \ln\Omega(E,\mathbf{x}). \tag{IV.2} $$

An additional factor of $k_B$ is introduced compared to the definition of eq.(III.68), so that the entropy has the correct dimensions of energy per degree Kelvin, used in thermodynamics. $\Omega$ and $S$ are not changed by a canonical change of coordinates in phase space. For a collection of independent systems, the overall allowed phase space is the product of individual ones, i.e. $\Omega_{\text{Total}} = \prod_i \Omega_i$. The resulting entropy is thus additive, as expected for an extensive quantity.


Various results in thermodynamics now follow from eq.(IV.1), provided that we consider macroscopic systems with many degrees of freedom.


• The Zeroth law: Equilibrium properties are discussed in thermodynamics by placing two previously isolated systems in contact, and allowing them to exchange heat. We can similarly bring together two microcanonical systems, and allow them to exchange energy, but not work. If the original systems have energies $E_1$ and $E_2$ respectively, the combined system has energy $E = E_1 + E_2$. Assuming that interactions between the two parts are small, each micro-state of the joint system corresponds to a pair of micro-states of the two components, i.e. $\mu = \mu_1 \otimes \mu_2$, and $\mathcal{H}(\mu_1\otimes\mu_2) = \mathcal{H}_1(\mu_1) + \mathcal{H}_2(\mu_2)$. As the joint system is in a microcanonical ensemble of energy $E = E_1 + E_2$, in equilibrium

$$ p_E(\mu_1\otimes\mu_2) = \frac{1}{\Omega(E)}\cdot\begin{cases} 1 & \text{for } \mathcal{H}_1(\mu_1) + \mathcal{H}_2(\mu_2) = E \\ 0 & \text{otherwise} \end{cases} \tag{IV.3} $$


Since only the overall energy is fixed, the total allowed phase space is computed from

$$ \Omega(E) = \int dE_1\,\Omega_1(E_1)\,\Omega_2(E - E_1) = \int dE_1 \exp\left[\frac{S_1(E_1) + S_2(E - E_1)}{k_B}\right]. \tag{IV.4} $$


The properties of the two systems in the new joint equilibrium state are implicit in eq.(IV.3). We can make them explicit by examining the entropy that follows from eq.(IV.4). Extensivity of entropies suggests that $S_1$ and $S_2$ are proportional to the number of particles in the systems, making the integrand in eq.(IV.4) an exponentially large quantity. Hence the integral can be equated by the saddle point method to the maximum value of the integrand, obtained for energies $E_1^*$ and $E_2^* = E - E_1^*$, i.e.

$$ S(E) = k_B\ln\Omega(E) \approx S_1(E_1^*) + S_2(E_2^*). \tag{IV.5} $$


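The position of the maximum is obtained by extremizing the exponent in eq.(IV.4) with respect to $E_1$, resulting in the condition,

$$ \left.\frac{\partial S_1}{\partial E_1}\right|_{\mathbf{x}_1} = \left.\frac{\partial S_2}{\partial E_2}\right|_{\mathbf{x}_2}. \tag{IV.6} $$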
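The extremum condition can be illustrated with a concrete choice of entropies. The sketch below is only an example under stated assumptions: each subsystem is given an entropy of the form $S_i(E_i) = \frac{3}{2}N_ik_B\ln E_i$ (the energy dependence of a monatomic ideal gas, introduced here purely for illustration), and the maximum of $S_1(E_1) + S_2(E - E_1)$ is located by a ternary search rather than analytically.

```python
import math

kB = 1.0
N1, N2 = 300, 700      # particle numbers (illustrative values)
E_total = 1000.0       # total energy shared between the two systems

def S1(E):
    """Assumed entropy of system 1: (3/2) N1 kB ln E."""
    return 1.5 * N1 * kB * math.log(E)

def S2(E):
    """Assumed entropy of system 2: (3/2) N2 kB ln E."""
    return 1.5 * N2 * kB * math.log(E)

def total_entropy(E1):
    return S1(E1) + S2(E_total - E1)

# Ternary search for the maximum of the concave function S1 + S2.
lo, hi = 1e-9, E_total - 1e-9
for _ in range(200):
    left = lo + (hi - lo) / 3.0
    right = hi - (hi - lo) / 3.0
    if total_entropy(left) < total_entropy(right):
        lo = left
    else:
        hi = right
E1_star = 0.5 * (lo + hi)
E2_star = E_total - E1_star

# dS/dE for each system at the maximum; these play the role of 1/T.
inv_T1 = 1.5 * N1 * kB / E1_star
inv_T2 = 1.5 * N2 * kB / E2_star

print(E1_star, E2_star, inv_T1, inv_T2)
```

With these assumed entropies the energy divides in proportion to the particle numbers, and the two derivatives $\partial S_i/\partial E_i$ agree at the maximum, as required by eq.(IV.6).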


Although all joint micro-states are equally likely, the above results indicate that there are an exponentially larger number of states in the vicinity of $(E_1^*, E_2^*)$. Originally, the joint system starts in the vicinity of the point $(E_1, E_2)$. After the exchange of energy takes place, the combined system explores a whole set of new micro-states. The probabilistic arguments provide no information on the dynamics of evolution amongst these micro-states, or on the amount of time needed to establish equilibrium. However, once sufficient time has elapsed so that the assumption of equal a priori probabilities is again valid, the system is overwhelmingly likely to be at a state with internal energies $(E_1^*, E_2^*)$. At this equilibrium point, condition (IV.6) is satisfied, specifying a relation between two functions of state. These state functions are thus equivalent to empirical temperatures, and indeed, consistent with the fundamental result of thermodynamics, we have

$$ \left.\frac{\partial S}{\partial E}\right|_{\mathbf{x}} = \frac{1}{T}. \tag{IV.7} $$


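• The first law: We next inquire about the variations of $S(E,\mathbf{x})$ with $\mathbf{x}$, by changing the coordinates by $\delta\mathbf{x}$. This results in doing work on the system by an amount $dW = \mathbf{J}\cdot\delta\mathbf{x}$, and changes the internal energy to $E + \mathbf{J}\cdot\delta\mathbf{x}$. The first order change in entropy is given by

$$ \delta S = S(E + \mathbf{J}\cdot\delta\mathbf{x},\, \mathbf{x} + \delta\mathbf{x}) - S(E,\mathbf{x}) = \left( \left.\frac{\partial S}{\partial E}\right|_{\mathbf{x}}\mathbf{J} + \left.\frac{\partial S}{\partial\mathbf{x}}\right|_{E} \right)\cdot\delta\mathbf{x}. \tag{IV.8} $$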


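This change will occur spontaneously, taking the system into a more probable state, unless the quantity in brackets is zero. Using eq.(IV.7), this allows us to identify the derivatives

$$ \left.\frac{\partial S}{\partial x_i}\right|_{E,\,x_{j\neq i}} = -\frac{J_i}{T}. \tag{IV.9} $$

Having thus identified all variations of $S$, we have

$$ dS(E,\mathbf{x}) = \frac{dE}{T} - \frac{\mathbf{J}\cdot d\mathbf{x}}{T}, \quad\Rightarrow\quad dE = T\,dS + \mathbf{J}\cdot d\mathbf{x}, \tag{IV.10} $$

allowing us to identify the heat input $dQ = T\,dS$.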

• The second law: Clearly, the above statistical definition of equilibrium rests on the presence of many degrees of freedom $N \gg 1$, which make it exponentially unlikely in $N$ that the combined system is found with component energies different from $(E_1^*, E_2^*)$. By this construction, the equilibrium point has a larger number of accessible states than the starting point, i.e.

$$ \Omega_1(E_1^*,\mathbf{x}_1)\,\Omega_2(E_2^*,\mathbf{x}_2) \geq \Omega_1(E_1,\mathbf{x}_1)\,\Omega_2(E_2,\mathbf{x}_2). \tag{IV.11} $$

In the process of evolving to the more likely (and more densely populated) regions, there is an irreversible loss of information, accompanied by an increase in entropy,

$$ \delta S = S_1(E_1^*) + S_2(E_2^*) - S_1(E_1) - S_2(E_2) \geq 0, \tag{IV.12} $$


as required by the second law of thermodynamics. When the two bodies are first brought into contact, the equality in eq.(IV.6) does not hold. The change in entropy is such that

$$ \delta S = \left( \left.\frac{\partial S_1}{\partial E_1}\right|_{\mathbf{x}_1} - \left.\frac{\partial S_2}{\partial E_2}\right|_{\mathbf{x}_2} \right)\delta E_1 = \left(\frac{1}{T_1} - \frac{1}{T_2}\right)\delta E_1 \geq 0, \tag{IV.13} $$

i.e. heat (energy) flows from the hotter to the colder body, as in Clausius’s statement of the second law.
• Stability conditions: Since the point $(E_1^*, E_2^*)$ is a maximum, the second derivative of $S_1(E_1) + S_2(E_2)$ must be negative at this point, i.e.

$$ \left.\frac{\partial^2S_1}{\partial E_1^2}\right|_{\mathbf{x}_1} + \left.\frac{\partial^2S_2}{\partial E_2^2}\right|_{\mathbf{x}_2} \leq 0. \tag{IV.14} $$

Applying the above condition to two parts of the same system, the condition of thermal stability, $C_x \geq 0$, as discussed in section (II), is regained. Similarly, the second order changes in eq.(IV.8) must be negative, requiring that the matrix $\left.\partial^2S/\partial x_i\partial x_j\right|_E$ be positive definite.


IV.C Two-Level Systems 


Consider $N$ impurity atoms trapped in a solid matrix. Each impurity can be in one of two states, with energies 0 and $\epsilon$ respectively. This example is somewhat different from the situations considered so far, in that the allowed micro-states are discrete. Liouville’s theorem applies to Hamiltonian evolution in a continuous phase space. Although there is less ambiguity in enumeration of discrete states, the dynamics that ensures that all allowed micro-states are equally accessed will remain unspecified for the moment. (An example from quantum mechanical evolution will be presented later on.)

The micro-states of the two level system are specified by the set of occupation numbers $\{n_i\}$, where $n_i = 0$ or 1 depending on whether the $i$-th impurity is in its ground state or excited. The overall energy is

$$ \mathcal{H}(\{n_i\}) = \epsilon\sum_{i=1}^{N}n_i \equiv \epsilon N_1, \tag{IV.15} $$


where $N_1$ is the total number of excited impurities. The macro-state is specified by the total energy $E$, and the number of impurities $N$. The microcanonical probability is thus

$$ p(\{n_i\}) = \frac{1}{\Omega(E,N)}\,\delta_{\epsilon\sum_i n_i,\,E}. \tag{IV.16} $$


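As there are $N_1 = E/\epsilon$ excited impurities, the normalization $\Omega$ is the number of ways of choosing $N_1$ excited levels among the available $N$, and given by the binomial coefficient

$$ \Omega(E,N) = \frac{N!}{N_1!\,(N - N_1)!}. \tag{IV.17} $$

The entropy

$$ S(E,N) = k_B\ln\frac{N!}{N_1!\,(N - N_1)!} \tag{IV.18} $$

can be simplified by Stirling’s formula in the limit of $N_1, N \gg 1$ to

$$ \begin{aligned} S(E,N) &\approx -Nk_B\left[\frac{N_1}{N}\ln\frac{N_1}{N} + \frac{N - N_1}{N}\ln\frac{N - N_1}{N}\right] \\ &= -Nk_B\left[\frac{E}{N\epsilon}\ln\left(\frac{E}{N\epsilon}\right) + \left(1 - \frac{E}{N\epsilon}\right)\ln\left(1 - \frac{E}{N\epsilon}\right)\right]. \end{aligned} \tag{IV.19} $$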
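The equilibrium temperature can now be calculated from eq.(IV.7) as

$$ \frac{1}{T} = \left.\frac{\partial S}{\partial E}\right|_N = -\frac{k_B}{\epsilon}\ln\left(\frac{E}{N\epsilon - E}\right). \tag{IV.20} $$

Alternatively, the internal energy at a temperature $T$ is given by

$$ E(T) = \frac{N\epsilon}{\exp\left(\frac{\epsilon}{k_BT}\right) + 1}. \tag{IV.21} $$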


The internal energy is a monotonic function of temperature, increasing from a minimum value of 0 at $T = 0$ to a maximum value of $N\epsilon/2$ at infinite temperature. It is, however, possible to start with energies larger than $N\epsilon/2$, which correspond to negative temperatures from eq.(IV.20). The origin of the negative temperature is the decrease in the number of microstates with increasing energy, the opposite of what happens in most systems. Two level systems have an upper bound on their energy, and very few microstates close to this maximal energy. Hence increased energy leads to more order in the system. However, once a negative temperature system is brought into contact with the rest of the universe (or any portion of it without an upper bound in energy), it loses its excess energy and comes to equilibrium at a positive temperature. The world of negative temperatures is quite unusual in that systems can be cooled by adding heat, and heated by removing it. There are physical examples of systems temporarily prepared at a metastable equilibrium of negative temperature in lasers, and for magnetic spins.
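The relations among eqs.(IV.17) to (IV.21) can be checked directly for a large but finite $N$. The sketch below is a minimal illustration with hypothetical helper names; it uses the log-gamma function for $\ln N!$, estimates $\epsilon/k_BT = \partial\ln\Omega/\partial N_1$ by a centered difference, and compares with the closed forms, including a case with more than half of the impurities excited.

```python
import math

def ln_omega(N, N1):
    """ln of the binomial coefficient N!/(N1!(N-N1)!), via log-gamma."""
    return math.lgamma(N + 1) - math.lgamma(N1 + 1) - math.lgamma(N - N1 + 1)

def stirling_entropy(N, N1):
    """S/k_B from the Stirling form of eq.(IV.19)."""
    x = N1 / N
    return -N * (x * math.log(x) + (1.0 - x) * math.log(1.0 - x))

def eps_over_kT(N, N1):
    """epsilon/(k_B T) = d ln(Omega)/d N1, by a centered difference."""
    return 0.5 * (ln_omega(N, N1 + 1) - ln_omega(N, N1 - 1))

def thermal_excitations(N, x):
    """N1 = E/epsilon from eq.(IV.21), with x = epsilon/(k_B T)."""
    return N / (math.exp(x) + 1.0)

N = 10**6

# One quarter of the impurities excited: positive temperature.
x_pos = eps_over_kT(N, N // 4)
# Three quarters excited (E > N*epsilon/2): negative temperature.
x_neg = eps_over_kT(N, 3 * N // 4)

print(x_pos, math.log(3.0))   # eq.(IV.20) gives ln((N - N1)/N1) = ln 3
print(x_neg, -math.log(3.0))
print(thermal_excitations(N, x_pos))
print(ln_omega(N, N // 4), stirling_entropy(N, N // 4))
```

The sign change of `eps_over_kT` on passing through $N_1 = N/2$ is the negative temperature discussed above, and the exact and Stirling entropies differ only by a relative amount of order $\ln N/N$.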




The heat capacity of the system, given by

$$ C = \frac{dE}{dT} = Nk_B\left(\frac{\epsilon}{k_BT}\right)^2\exp\left(\frac{\epsilon}{k_BT}\right)\left[\exp\left(\frac{\epsilon}{k_BT}\right) + 1\right]^{-2}, \tag{IV.22} $$

vanishes at both low and high temperatures. The vanishing of $C$ as $\exp(-\epsilon/k_BT)$ at low temperatures is characteristic of all systems with an energy gap separating the ground state and lowest excited states. The vanishing of $C$ at high temperatures is a saturation effect, common to systems with a maximum in the number of states as a function of energy. In between, the heat capacity exhibits a peak at a characteristic temperature of $T_\epsilon \propto \epsilon/k_B$.
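The shape of the heat capacity can be made concrete with a short numerical scan. This is a minimal sketch in reduced units, with $t = k_BT/\epsilon$ and quantities quoted per impurity; the grid range and spacing are arbitrary choices made for illustration.

```python
import math

def energy_per_impurity(t):
    """E/(N epsilon) at reduced temperature t = k_B T / epsilon, from eq.(IV.21)."""
    return 1.0 / (math.exp(1.0 / t) + 1.0)

def heat_capacity(t):
    """C/(N k_B) from eq.(IV.22)."""
    x = 1.0 / t
    return x * x * math.exp(x) / (math.exp(x) + 1.0) ** 2

def numerical_heat_capacity(t, h=1e-5):
    """Centered finite difference of the energy, as a check on eq.(IV.22)."""
    return (energy_per_impurity(t + h) - energy_per_impurity(t - h)) / (2.0 * h)

# Scan a grid of reduced temperatures for the maximum of C.
grid = [0.05 + 0.001 * i for i in range(3000)]
t_peak = max(grid, key=heat_capacity)

print(t_peak, heat_capacity(t_peak))
print(heat_capacity(0.05), heat_capacity(50.0))
print(heat_capacity(0.5), numerical_heat_capacity(0.5))
```

On this grid the peak sits near $k_BT \approx 0.42\,\epsilon$ with a height of about $0.44\,Nk_B$, while the values at the two ends of the scan are smaller by several orders of magnitude.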
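Statistical mechanics provides much more than just macroscopic quantities such as energy and heat capacity. Eq.(IV.16) is a complete joint probability distribution with considerable information on the micro-states. For example, the unconditional probability for exciting a particular impurity is obtained from

$$ p(n_1) = \sum_{\{n_2,\cdots,n_N\}}p(\{n_i\}) = \frac{\Omega(E - n_1\epsilon,\, N - 1)}{\Omega(E,N)}. \tag{IV.23} $$

The second equality is obtained by noting that once the energy taken by the first impurity is specified, the remaining energy must be distributed among the other $N - 1$ impurities. Using eq.(IV.17),

$$ p(n_1 = 0) = \frac{\Omega(E, N - 1)}{\Omega(E,N)} = \frac{(N-1)!}{N_1!\,(N - N_1 - 1)!}\cdot\frac{N_1!\,(N - N_1)!}{N!} = 1 - \frac{N_1}{N}, \tag{IV.24} $$

and $p(n_1 = 1) = 1 - p(n_1 = 0) = N_1/N$. Using $N_1 = E/\epsilon$, and eq.(IV.21), the occupation probabilities at a temperature $T$ are

$$ p(0) = \frac{1}{1 + \exp\left(-\frac{\epsilon}{k_BT}\right)}, \qquad\text{and}\qquad p(1) = \frac{\exp\left(-\frac{\epsilon}{k_BT}\right)}{1 + \exp\left(-\frac{\epsilon}{k_BT}\right)}. \tag{IV.25} $$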


IV.D The Ideal Gas 


As discussed in chapter II, micro-states of a gas of $N$ particles correspond to points $\mu \equiv \{\vec{p}_i, \vec{q}_i\}$, in the $6N$-dimensional phase space. Ignoring the potential energy of interactions, the particles are subject to a Hamiltonian

$$ \mathcal{H} = \sum_{i=1}^{N}\left[\frac{\vec{p}_i{}^2}{2m} + U(\vec{q}_i)\right], \tag{IV.26} $$

where $U(\vec{q})$ describes the potential imposed by a box of volume $V$. A microcanonical ensemble is specified by its energy, volume, and number of particles, $M \equiv (E,V,N)$. The joint PDF for a micro-state is

$$ p(\mu) = \frac{1}{\Omega(E,V,N)}\cdot\begin{cases} 1 & \text{for } \vec{q}_i \in \text{box, and } \sum_i\vec{p}_i{}^2/2m = E\ (\pm\Delta_E) \\ 0 & \text{otherwise} \end{cases} \tag{IV.27} $$


In the allowed micro-states, coordinates of the particles must be within the box, while the momenta are constrained to the surface of the (hyper-)sphere $\sum_{i=1}^{N}\vec{p}_i{}^2 = 2mE$. The allowed phase space is thus the product of a contribution $V^N$ from the coordinates, with the surface area of a $3N$-dimensional sphere of radius $\sqrt{2mE}$ from the momenta. (If the microstate energies are accepted in the energy interval $E \pm \Delta_E$, the corresponding volume in momentum space is that of a (hyper-)spherical shell of thickness $\Delta_R = \sqrt{2m/E}\,\Delta_E$.) The area of a $d$-dimensional sphere is $A_d = S_dR^{d-1}$, where $S_d$ is the generalized solid angle.


A simple way to calculate the $d$-dimensional solid angle is to consider the product of $d$ Gaussian integrals,

$$ I_d \equiv \left(\int_{-\infty}^{\infty}dx\,e^{-x^2}\right)^d = \pi^{d/2}. \tag{IV.28} $$

Alternatively, we may consider $I_d$ as an integral over an entire $d$-dimensional space, i.e.

$$ I_d = \int\prod_{i=1}^{d}dx_i\,\exp\left(-x_i^2\right). \tag{IV.29} $$

The integrand is spherically symmetric, and we can change coordinates to $R^2 = \sum_ix_i^2$. Noting that the corresponding volume element in these coordinates is $dV_d = S_dR^{d-1}dR$,

$$ I_d = \int_0^{\infty}dR\,S_dR^{d-1}e^{-R^2} = \frac{S_d}{2}\int_0^{\infty}dy\,y^{d/2-1}e^{-y} = \frac{S_d}{2}\,(d/2 - 1)!, \tag{IV.30} $$


where we have first made a change of variables to $y = R^2$, and then used the integral representation of $n!$. Equating expressions (IV.28) and (IV.30) for $I_d$ gives the final result for the solid angle,

$$ S_d = \frac{2\pi^{d/2}}{(d/2 - 1)!}. \tag{IV.31} $$

The volume of the available phase space is thus given by

$$ \Omega(E,V,N) = V^N\,\frac{2\pi^{3N/2}}{(3N/2 - 1)!}\,(2mE)^{(3N-1)/2}\,\Delta_R. \tag{IV.32} $$
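The solid angle formula of eq.(IV.31) can be tested against familiar low-dimensional cases and against the Gaussian integral it was derived from. In this minimal sketch the factorial $(d/2-1)!$ is evaluated as $\Gamma(d/2)$ so that odd $d$ is covered, and the radial integral of eq.(IV.30) is estimated with a simple midpoint rule; the cutoff `r_max` and the number of steps are arbitrary illustrative choices.

```python
import math

def solid_angle(d):
    """S_d = 2 pi^(d/2) / (d/2 - 1)!, with the factorial written as Gamma(d/2)."""
    return 2.0 * math.pi ** (d / 2.0) / math.gamma(d / 2.0)

def gaussian_radial_integral(d, r_max=10.0, steps=20000):
    """Midpoint-rule estimate of the integral of R^(d-1) exp(-R^2) over 0 < R < r_max."""
    h = r_max / steps
    total = 0.0
    for i in range(steps):
        r = (i + 0.5) * h
        total += r ** (d - 1) * math.exp(-r * r)
    return total * h

# Familiar cases: two points (d=1), circle (d=2), ordinary sphere (d=3).
print(solid_angle(1), solid_angle(2), solid_angle(3))

# I_d computed radially, S_d times the radial integral, versus pi^(d/2).
for d in range(1, 7):
    print(d, solid_angle(d) * gaussian_radial_integral(d), math.pi ** (d / 2.0))
```

The first line reproduces $S_1 = 2$, $S_2 = 2\pi$, and $S_3 = 4\pi$, and the loop confirms that the radial form of $I_d$ agrees with $\pi^{d/2}$ for each $d$ tried.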
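The entropy is obtained from the logarithm of the above expression. Using Stirling’s formula, and neglecting terms of order of 1 or $\ln E \sim \ln N$ in the large $N$ limit, results in

$$ \begin{aligned} S(E,V,N) &= k_B\left[N\ln V + \frac{3N}{2}\ln(2\pi mE) - \frac{3N}{2}\ln\frac{3N}{2} + \frac{3N}{2}\right] \\ &= Nk_B\ln\left[V\left(\frac{4\pi emE}{3N}\right)^{3/2}\right]. \end{aligned} \tag{IV.33} $$

Properties of the ideal gas can now be recovered from $T\,dS = dE + P\,dV - \mu\,dN$,

$$ \frac{1}{T} = \left.\frac{\partial S}{\partial E}\right|_{N,V} = \frac{3}{2}\,\frac{Nk_B}{E}. \tag{IV.34} $$

The internal energy $E = 3Nk_BT/2$ is only a function of $T$, and the heat capacity $C_V = 3Nk_B/2$ is a constant. The equation of state is obtained from

$$ \frac{P}{T} = \left.\frac{\partial S}{\partial V}\right|_{N,E} = \frac{Nk_B}{V}, \quad\Rightarrow\quad PV = Nk_BT. \tag{IV.35} $$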

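The unconditional probability of finding a particle of momentum $\vec{p}_1$ in the gas can be calculated from the joint PDF in eq.(IV.27), by integrating over all other variables,

$$ \begin{aligned} p(\vec{p}_1) &= \int d^3\vec{q}_1\prod_{i=2}^{N}d^3\vec{q}_i\,d^3\vec{p}_i\; p(\{\vec{q}_i,\vec{p}_i\}) \\ &= \frac{V\,\Omega(E - \vec{p}_1{}^2/2m,\, V,\, N-1)}{\Omega(E,V,N)}. \end{aligned} \tag{IV.36} $$

The final expression indicates that once the kinetic energy of one particle is specified, the remaining energy must be shared amongst the other $N - 1$. Using eq.(IV.32),

$$ \begin{aligned} p(\vec{p}_1) &= \frac{V^N\pi^{3(N-1)/2}\left(2mE - \vec{p}_1{}^2\right)^{(3N-4)/2}}{\left(3(N-1)/2 - 1\right)!}\cdot\frac{(3N/2 - 1)!}{V^N\pi^{3N/2}(2mE)^{(3N-1)/2}} \\ &= \left(1 - \frac{\vec{p}_1{}^2}{2mE}\right)^{3N/2-2}\frac{1}{(2\pi mE)^{3/2}}\,\frac{(3N/2 - 1)!}{\left(3(N-1)/2 - 1\right)!}. \end{aligned} \tag{IV.37} $$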


From Stirling’s formula, the ratio of $(3N/2 - 1)!$ to $(3(N-1)/2 - 1)!$ is approximately $(3N/2)^{3/2}$, and in the large $E$ limit,

$$ p(\vec{p}_1) = \left(\frac{3N}{4\pi mE}\right)^{3/2}\exp\left(-\frac{3N}{2}\,\frac{\vec{p}_1{}^2}{2mE}\right). \tag{IV.38} $$

This is a properly normalized Maxwell-Boltzmann distribution, which can be displayed in its more familiar form after the substitution $E = 3Nk_BT/2$,

$$ p(\vec{p}_1) = \frac{1}{(2\pi mk_BT)^{3/2}}\exp\left(-\frac{\vec{p}_1{}^2}{2mk_BT}\right). \tag{IV.39} $$
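The passage from the finite-$N$ expression of eq.(IV.37) to the Maxwell-Boltzmann form of eq.(IV.39) can be followed numerically. The sketch below is illustrative only: it works with logarithms of the densities, writes the factorials as gamma functions, sets $E = 3Nk_BT/2$, and uses units with $m = k_BT = 1$; the function names are hypothetical.

```python
import math

def ln_p_exact(p2, N, m, E):
    """ln of eq.(IV.37); p2 is the squared momentum of the chosen particle."""
    return ((1.5 * N - 2.0) * math.log1p(-p2 / (2.0 * m * E))
            - 1.5 * math.log(2.0 * math.pi * m * E)
            + math.lgamma(1.5 * N)           # ln (3N/2 - 1)!
            - math.lgamma(1.5 * (N - 1)))    # ln (3(N-1)/2 - 1)!

def ln_p_maxwell(p2, m, kT):
    """ln of eq.(IV.39)."""
    return -1.5 * math.log(2.0 * math.pi * m * kT) - p2 / (2.0 * m * kT)

m, kT = 1.0, 1.0
N = 10**6
E = 1.5 * N * kT

# Difference of the two log-densities at several values of p^2.
differences = [abs(ln_p_exact(p2, N, m, E) - ln_p_maxwell(p2, m, kT))
               for p2 in (0.0, 1.0, 3.0, 10.0)]
print(differences)
```

For $N = 10^6$ the two logarithms agree to within a few parts in a million over this range of momenta, which is the sense in which a single particle sees the rest of the gas as a heat bath.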


IV.E Mixing Entropy and Gibbs’ Paradox 


The expression in eq.(IV.33) for the entropy of the ideal gas has a major shortcoming in that it is not extensive. Under the transformation $(E,V,N) \to (\lambda E, \lambda V, \lambda N)$, the entropy changes to $\lambda(S + Nk_B\ln\lambda)$. The additional term comes from the contribution $V^N$ of the coordinates to the available phase space. This difficulty is intimately related to the mixing entropy of two gases. Consider two distinct gases, initially occupying volumes $V_1$ and $V_2$ at the same temperature $T$. The partition between them is removed, and they are allowed to expand and occupy the combined volume $V = V_1 + V_2$. The mixing process is clearly irreversible, and must be accompanied by an increase in entropy, calculated as follows.


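According to eq.(IV.33), the initial entropy is

$$ S_i = S_1 + S_2 = N_1k_B\left(\ln V_1 + \sigma_1\right) + N_2k_B\left(\ln V_2 + \sigma_2\right), \tag{IV.40} $$

where,

$$ \sigma_\alpha = \ln\left(\frac{4\pi em_\alpha}{3}\cdot\frac{E_\alpha}{N_\alpha}\right)^{3/2}, \tag{IV.41} $$

is the momentum contribution to the entropy of the $\alpha$-th gas. Since $E_\alpha/N_\alpha = 3k_BT/2$ for a monatomic gas,

$$ \sigma_\alpha(T) = \frac{3}{2}\ln\left(2\pi em_\alpha k_BT\right). \tag{IV.42} $$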


The temperature of the gas is unchanged by mixing, since

$$ \frac{3}{2}k_BT_f = \frac{E_1 + E_2}{N_1 + N_2} = \frac{E_1}{N_1} = \frac{E_2}{N_2} = \frac{3}{2}k_BT. \tag{IV.43} $$

The final entropy of the mixed gas is

$$ S_f = N_1k_B\ln(V_1 + V_2) + N_2k_B\ln(V_1 + V_2) + k_B\left(N_1\sigma_1 + N_2\sigma_2\right). \tag{IV.44} $$


There is no change in the contribution from the momenta which depends only on temperature. The mixing entropy,

$$ \Delta S_{\text{Mix}} = S_f - S_i = N_1k_B\ln\frac{V}{V_1} + N_2k_B\ln\frac{V}{V_2} = -Nk_B\left[\frac{N_1}{N}\ln\frac{V_1}{V} + \frac{N_2}{N}\ln\frac{V_2}{V}\right], \tag{IV.45} $$

is solely from the contribution of the coordinates. The above expression is easily generalized to the mixing of many components, with $\Delta S_{\text{Mix}} = -Nk_B\sum_\alpha(N_\alpha/N)\ln(V_\alpha/V)$.

Gibbs’ Paradox is related to what happens when the two gases, initially on the two sides of the partition, are identical with the same density, $n = N_1/V_1 = N_2/V_2$. Since removing or inserting the partition does not change the state of the system, there should be no entropy of mixing, while eq.(IV.45) does predict such a change. For the resolution of this paradox, note that while after removing and reinserting the partition, the system does return to its initial configuration, the actual particles that occupy the two components are not the same. But as the particles are by assumption identical, these configurations cannot be distinguished. In other words, while the exchange of distinct particles leads to two configurations

[ A | B ] and [ B | A ],

a similar exchange has no effect on identical particles, as in

[ A | A ] and [ A | A ].

Therefore, we have over-counted the phase space associated with $N$ identical particles by the number of possible permutations. As there are $N!$ permutations leading to indistinguishable micro-states, eq.(IV.32) should be corrected to

$$ \Omega(N,E,V) = \frac{V^N}{N!}\,\frac{2\pi^{3N/2}}{(3N/2 - 1)!}\,(2mE)^{(3N-1)/2}\,\Delta_R, \tag{IV.46} $$

resulting in a modified entropy,

$$ S = k_B\ln\Omega = k_B\left[N\ln V - N\ln N + N\ln e\right] + Nk_B\sigma = Nk_B\left[\ln\frac{eV}{N} + \sigma\right]. \tag{IV.47} $$


As the argument of the logarithm has changed from $V$ to $V/N$, the final expression is now properly extensive. The mixing entropies can be recalculated using eq.(IV.47). For the mixing of distinct gases,

$$ \begin{aligned} \Delta S_{\text{Mix}} = S_f - S_i &= N_1k_B\ln\frac{V}{N_1} + N_2k_B\ln\frac{V}{N_2} - N_1k_B\ln\frac{V_1}{N_1} - N_2k_B\ln\frac{V_2}{N_2} \\ &= N_1k_B\ln\left(\frac{V}{N_1}\cdot\frac{N_1}{V_1}\right) + N_2k_B\ln\left(\frac{V}{N_2}\cdot\frac{N_2}{V_2}\right) \\ &= -Nk_B\left[\frac{N_1}{N}\ln\frac{V_1}{V} + \frac{N_2}{N}\ln\frac{V_2}{V}\right], \end{aligned} \tag{IV.48} $$

exactly as obtained before in eq.(IV.45). For the ‘mixing’ of two identical gases, with $N_1/V_1 = N_2/V_2 = (N_1 + N_2)/(V_1 + V_2)$,

$$ \Delta S_{\text{Mix}} = S_f - S_i = (N_1 + N_2)k_B\ln\frac{V_1 + V_2}{N_1 + N_2} - N_1k_B\ln\frac{V_1}{N_1} - N_2k_B\ln\frac{V_2}{N_2} = 0. \tag{IV.49} $$


Note that after taking the permutations of identical particles into account, the available volume in the final state is $V^{N_1+N_2}/N_1!\,N_2!$ for distinct particles, and $V^{N_1+N_2}/(N_1 + N_2)!$ for identical particles.
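The bookkeeping of the mixing entropy, with and without the $N!$ correction, can be summarized in a few lines. The following is a minimal sketch with hypothetical function names; entropies are written in the forms $Nk_B(\ln V + \sigma)$ of eq.(IV.40) and $Nk_B(\ln(eV/N) + \sigma)$ of eq.(IV.47), with $\sigma$ treated as a given number that is common to both sides of the partition.

```python
import math

def entropy_uncorrected(N, V, sigma, kB=1.0):
    """S = N kB (ln V + sigma), the form used in eq.(IV.40)."""
    return N * kB * (math.log(V) + sigma)

def entropy_corrected(N, V, sigma, kB=1.0):
    """S = N kB (ln(e V / N) + sigma), eq.(IV.47)."""
    return N * kB * (math.log(V / N) + 1.0 + sigma)

def mixing_entropy(entropy, N1, V1, N2, V2, sigma, identical):
    """Entropy change on removing the partition between the two gases."""
    V = V1 + V2
    initial = entropy(N1, V1, sigma) + entropy(N2, V2, sigma)
    if identical:
        final = entropy(N1 + N2, V, sigma)  # one gas of N1 + N2 particles
    else:
        final = entropy(N1, V, sigma) + entropy(N2, V, sigma)
    return final - initial

# Illustrative values with equal densities N1/V1 = N2/V2.
N1, V1, N2, V2, sigma = 2000.0, 2.0, 3000.0, 3.0, 1.7

distinct_old = mixing_entropy(entropy_uncorrected, N1, V1, N2, V2, sigma, False)
distinct_new = mixing_entropy(entropy_corrected, N1, V1, N2, V2, sigma, False)
identical_old = mixing_entropy(entropy_uncorrected, N1, V1, N2, V2, sigma, True)
identical_new = mixing_entropy(entropy_corrected, N1, V1, N2, V2, sigma, True)

print(distinct_old, distinct_new)    # equal, as in eqs.(IV.45) and (IV.48)
print(identical_old, identical_new)  # spurious positive value versus zero
```

The corrected entropy is also extensive: doubling both `N` and `V` at fixed `sigma` doubles `entropy_corrected`, whereas `entropy_uncorrected` picks up the extra $Nk_B\ln\lambda$ noted at the start of this section.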


• Additional comments on the microcanonical entropy:

1. In the example of two-level impurities in a solid matrix (sec.IV.C), there is no need for the additional factor of $N!$, as the defects can be distinguished by their locations.

2. The corrected formula for the ideal gas entropy in eq.(IV.47) does not affect the computations of energy and pressure in eqs.(IV.34) and (IV.35). It is essential to obtaining an intensive chemical potential,

$$ \frac{\mu}{T} = -\left.\frac{\partial S}{\partial N}\right|_{E,V} = -\frac{S}{N} + \frac{5}{2}k_B = -k_B\ln\left[\frac{V}{N}\left(\frac{4\pi mE}{3N}\right)^{3/2}\right]. \tag{IV.50} $$

3. The above treatment of identical particles is somewhat artificial. This is because the concept of identical particles does not easily fit within the framework of classical mechanics. To implement the Hamiltonian equations of motion on a computer, one has to keep track of the coordinates of the $N$ particles. The computer will have no difficulty in distinguishing exchanged particles. The indistinguishability of their phase spaces is in a sense an additional postulate of classical statistical mechanics. This problem is elegantly resolved within the framework of quantum statistical mechanics. Description of identical particles in quantum mechanics requires proper symmetrization of the wave function. The corresponding quantum microstates naturally yield the $N!$ factor, as will be shown later on.

4. Yet another difficulty with the expression (IV.47), resolved in quantum statistical mechanics, is the arbitrary constant that appears in changing the units of measurement for $q$ and $p$. The volume of phase space involves products $pq$ of coordinates and conjugate momenta, and hence has dimensions of $(\text{action})^{3N}$. Quantum mechanics provides the appropriate measure of action in Planck’s constant $h$. Anticipating these quantum results, we shall henceforth set the measure of phase space for identical particles to

$$ d\Gamma_N = \frac{1}{h^{3N}N!}\prod_{i=1}^{N}d^3\vec{q}_i\,d^3\vec{p}_i. \tag{IV.51} $$


IV.F The Canonical Ensemble 


In the microcanonical ensemble, the energy $E$ of a large macroscopic system is precisely specified, and its equilibrium temperature $T$ emerges as a consequence (eq.(IV.7)). However, from a thermodynamic perspective, $E$ and $T$ are both functions of state and on the same footing. It is possible to construct a statistical mechanical formulation in which the temperature of the system is specified and its internal energy is then deduced. This is achieved in the canonical ensemble where the macro-states, specified by $M \equiv (T,\mathbf{x})$, allow the input of heat into the system, but no external work. The system $S$ is maintained at a constant temperature through contact with a reservoir $R$. The reservoir is another macroscopic system that is sufficiently large so that its temperature is not changed due to interactions with $S$. To find the probabilities $p_{(T,\mathbf{x})}(\mu)$ of the various micro-states of $S$, note that the combined system $R\oplus S$ belongs to a microcanonical ensemble of energy $E_{\text{Tot}} \gg E_S$. As in eq.(IV.3), the joint probability of micro-states $(\mu_S\otimes\mu_R)$ is

$$ p(\mu_S\otimes\mu_R) = \frac{1}{\Omega_{S\oplus R}(E_{\text{Tot}})}\cdot\begin{cases} 1 & \text{for } \mathcal{H}_S(\mu_S) + \mathcal{H}_R(\mu_R) = E_{\text{Tot}} \\ 0 & \text{otherwise} \end{cases} \tag{IV.52} $$

The unconditional probability for micro-states of $S$ is now obtained from

$$ p(\mu_S) = \sum_{\{\mu_R\}}p(\mu_S\otimes\mu_R). \tag{IV.53} $$


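Once $\mu_S$ is specified, the above sum is restricted to micro-states of the reservoir with energy
$E_{\rm tot} - \mathcal{H}_S(\mu_S)$. The number of such states is related to the entropy of the reservoir,
and leads to

$$p(\mu_S) = \frac{\Omega_R\left(E_{\rm tot} - \mathcal{H}_S(\mu_S)\right)}{\Omega_{S \oplus R}(E_{\rm tot})}
\propto \exp\left[\frac{1}{k_B} S_R\left(E_{\rm tot} - \mathcal{H}_S(\mu_S)\right)\right] . \tag{IV.54}$$

Since by assumption the energy of the system is insignificant compared to that of the
reservoir,

$$S_R\left(E_{\rm tot} - \mathcal{H}_S(\mu_S)\right) \approx S_R(E_{\rm tot}) - \mathcal{H}_S(\mu_S) \frac{\partial S_R}{\partial E_R}
= S_R(E_{\rm tot}) - \frac{\mathcal{H}_S(\mu_S)}{T} . \tag{IV.55}$$
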
Dropping the subscript S, the normalized probabilities are given by

$$p_{(T,\mathbf{x})}(\mu) = \frac{e^{-\beta \mathcal{H}(\mu)}}{Z(T,\mathbf{x})} . \tag{IV.56}$$

The normalization,

$$Z(T,\mathbf{x}) = \sum_{\{\mu\}} e^{-\beta \mathcal{H}(\mu)} , \tag{IV.57}$$

is known as the partition function, and $\beta \equiv 1/k_B T$. (Note that probabilities similar to
eq.(IV.56) were already obtained in eqs.(IV.25) and (IV.39), when considering a portion
of the system in equilibrium with the rest of it.)

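Is the internal energy $E$ of the system S well defined? Unlike in a microcanonical
ensemble, the energy of a system exchanging heat with a reservoir is a random variable.
Its probability distribution $p(\mathcal{E})$ is obtained by changing variables from $\mu$ to $\mathcal{H}(\mu)$ in $p(\mu)$,
resulting in

$$p(\mathcal{E}) = \sum_{\{\mu\}} p(\mu)\, \delta\left(\mathcal{H}(\mu) - \mathcal{E}\right)
= \frac{e^{-\beta \mathcal{E}}}{Z} \sum_{\{\mu\}} \delta\left(\mathcal{H}(\mu) - \mathcal{E}\right) . \tag{IV.58}$$

Since the restricted sum is just the number $\Omega(\mathcal{E})$ of micro-states of appropriate energy,

$$p(\mathcal{E}) = \frac{\Omega(\mathcal{E})\, e^{-\beta \mathcal{E}}}{Z}
= \frac{1}{Z} \exp\left[\frac{S(\mathcal{E})}{k_B} - \frac{\mathcal{E}}{k_B T}\right]
= \frac{1}{Z} \exp\left[-\frac{F(\mathcal{E})}{k_B T}\right] , \tag{IV.59}$$

where we have set $F = \mathcal{E} - T S(\mathcal{E})$, in anticipation of its relation to the Helmholtz free
energy. The probability $p(\mathcal{E})$ is sharply peaked at a most probable energy $E^*$, which
minimizes $F(\mathcal{E})$. Using the result in sec.(II.F) for sums over exponentials,

$$Z = \sum_{\{\mu\}} e^{-\beta \mathcal{H}(\mu)} = \sum_{\mathcal{E}} e^{-\beta F(\mathcal{E})} \approx e^{-\beta F(E^*)} . \tag{IV.60}$$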




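The average energy computed from the distribution in eq.(IV.59) is

$$\langle \mathcal{H} \rangle = \sum_{\mu} \mathcal{H}(\mu) \frac{e^{-\beta \mathcal{H}(\mu)}}{Z}
= -\frac{1}{Z} \frac{\partial}{\partial \beta} \sum_{\mu} e^{-\beta \mathcal{H}}
= -\frac{\partial \ln Z}{\partial \beta} . \tag{IV.61}$$

In thermodynamics, a similar expression was encountered for the energy (eq.(I.37)),

$$E = F + TS = F - T \left.\frac{\partial F}{\partial T}\right|_{\mathbf{x}}
= -T^2 \frac{\partial}{\partial T}\left(\frac{F}{T}\right) = \frac{\partial (\beta F)}{\partial \beta} . \tag{IV.62}$$

Eqs.(IV.60) and (IV.61) both suggest identifying

$$F(T, \mathbf{x}) = -k_B T \ln Z(T, \mathbf{x}) . \tag{IV.63}$$

However, note that eq.(IV.60) refers to the most likely energy, while the average energy
appears in eq.(IV.61). How close are these two values of the energy? We can get an idea
of the width of the probability distribution $p(\mathcal{E})$ by computing the variance $\langle \mathcal{H}^2 \rangle_c$. This is
most easily accomplished by noting that $Z(\beta)$ is proportional to the characteristic function
for $\mathcal{H}$ (with $\beta$ replacing $ik$) and

$$-\frac{\partial Z}{\partial \beta} = \sum_{\mu} \mathcal{H} e^{-\beta \mathcal{H}} , \quad \text{and} \quad
\frac{\partial^2 Z}{\partial \beta^2} = \sum_{\mu} \mathcal{H}^2 e^{-\beta \mathcal{H}} . \tag{IV.64}$$

Cumulants of $\mathcal{H}$ are generated by $\ln Z(\beta)$,

$$\langle \mathcal{H} \rangle_c = \frac{1}{Z} \sum_{\mu} \mathcal{H} e^{-\beta \mathcal{H}}
= -\frac{1}{Z} \frac{\partial Z}{\partial \beta} = -\frac{\partial \ln Z}{\partial \beta} , \tag{IV.65}$$

and

$$\langle \mathcal{H}^2 \rangle_c = \langle \mathcal{H}^2 \rangle - \langle \mathcal{H} \rangle^2
= \frac{1}{Z} \sum_{\mu} \mathcal{H}^2 e^{-\beta \mathcal{H}} - \frac{1}{Z^2} \left(\sum_{\mu} \mathcal{H} e^{-\beta \mathcal{H}}\right)^2
= \frac{\partial^2 \ln Z}{\partial \beta^2} = -\frac{\partial \langle \mathcal{H} \rangle}{\partial \beta} . \tag{IV.66}$$

More generally, the $n$-th cumulant of $\mathcal{H}$ is given by

$$\langle \mathcal{H}^n \rangle_c = (-1)^n \frac{\partial^n \ln Z}{\partial \beta^n} . \tag{IV.67}$$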




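From eq.(IV.66),

$$\langle \mathcal{H}^2 \rangle_c = -\frac{\partial \langle \mathcal{H} \rangle}{\partial \beta}
= k_B T^2 \left.\frac{\partial \langle \mathcal{H} \rangle}{\partial T}\right|_{\mathbf{x}}
\quad \Rightarrow \quad \langle \mathcal{H}^2 \rangle_c = k_B T^2 C_{\mathbf{x}} , \tag{IV.68}$$

where we have identified the heat capacity with the thermal derivative of the average
energy $\langle \mathcal{H} \rangle$. Eq.(IV.68) shows that it is justified to treat the mean and most likely energies
interchangeably, since the width of the distribution $p(\mathcal{E})$ only grows as $\sqrt{\langle \mathcal{H}^2 \rangle_c} \propto N^{1/2}$.
The relative error, $\sqrt{\langle \mathcal{H}^2 \rangle_c} / \langle \mathcal{H} \rangle_c$, vanishes in the thermodynamic limit as $1/\sqrt{N}$. (In fact
eq.(IV.67) shows that all cumulants of $\mathcal{H}$ are proportional to N.) The PDF for energy in
a canonical ensemble can thus be approximated by

$$p(\mathcal{E}) = \frac{1}{Z} e^{-\beta F(\mathcal{E})} \approx
\exp\left(-\frac{(\mathcal{E} - \langle \mathcal{H} \rangle)^2}{2 k_B T^2 C_{\mathbf{x}}}\right) \frac{1}{\sqrt{2 \pi k_B T^2 C_{\mathbf{x}}}} . \tag{IV.69}$$

The above distribution is sufficiently sharp to make the internal energy in a canonical
ensemble unambiguous in the $N \to \infty$ limit. Some care is necessary if the heat capacity
$C_{\mathbf{x}}$ is divergent, as is the case in some continuous phase transitions.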
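
As a rough numerical illustration of eqs.(IV.65) and (IV.66), the sketch below compares direct Boltzmann averages with finite-difference derivatives of ln Z. The five-level spectrum, the value of beta, and all function names are assumptions made only for this example; they do not come from the text.

```python
import math

def ln_Z(beta, energies):
    """ln of the canonical partition function for a list of micro-state energies."""
    return math.log(sum(math.exp(-beta * e) for e in energies))

def moments(beta, energies):
    """Direct Boltzmann averages: <H> and <H^2> - <H>^2."""
    weights = [math.exp(-beta * e) for e in energies]
    Z = sum(weights)
    mean = sum(e * w for e, w in zip(energies, weights)) / Z
    mean_sq = sum(e * e * w for e, w in zip(energies, weights)) / Z
    return mean, mean_sq - mean ** 2

def cumulants_from_ln_Z(beta, energies, h=1e-4):
    """First two cumulants from central finite differences of ln Z in beta."""
    lz_minus, lz_0, lz_plus = (ln_Z(beta + s * h, energies) for s in (-1, 0, 1))
    first = -(lz_plus - lz_minus) / (2 * h)            # -d ln Z / d beta
    second = (lz_plus - 2 * lz_0 + lz_minus) / h ** 2  # d^2 ln Z / d beta^2
    return first, second

energies = [0.0, 0.3, 1.0, 1.7, 2.5]   # hypothetical spectrum, arbitrary units
beta = 0.8
print(moments(beta, energies))
print(cumulants_from_ln_Z(beta, energies))
```

The two printed pairs should agree up to the finite-difference error, which is one way to see that ln Z acts as the generating function of the energy cumulants.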

The canonical probabilities in eq.(IV.56) are unbiased estimates obtained (as in
sec.(II.G)) by constraining the average energy. The entropy of the canonical ensemble
can also be calculated directly from eq.(IV.56) (using eq.(III.68)) as

$$S = -k_B \langle \ln p(\mu) \rangle = -k_B \langle -\beta \mathcal{H} - \ln Z \rangle = \frac{E - F}{T} , \tag{IV.70}$$

again using the identification of ln Z with the free energy from eq.(IV.63). For any finite
system, the canonical and microcanonical probabilities are distinct. However, in the
so-called thermodynamic limit of $N \to \infty$, the canonical probabilities are so sharply
peaked around the average energy that they are essentially indistinguishable from microcanonical
probabilities at that energy. The following table compares the prescriptions used in the
two ensembles.

Ensemble         Macro-state   p(μ)                                              Normalization
Microcanonical   (E, x)        $1/\Omega$ for $\mathcal{H}(\mu) = E$, 0 otherwise   $S(E, \mathbf{x}) = k_B \ln \Omega$
Canonical        (T, x)        $\exp[-\beta \mathcal{H}(\mu)] / Z$                  $F(T, \mathbf{x}) = -k_B T \ln Z$

Table 3: Comparison of canonical and microcanonical ensembles.




IV.G Examples 


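The two examples of sections (IV.C) and (IV.D) are now reexamined in the canonical
ensemble.
1. Two level systems: The N impurities are described by a macro-state $M \equiv (T, N)$.
Subject to the Hamiltonian $\mathcal{H} = \epsilon \sum_{i=1}^{N} n_i$, the canonical probabilities of the micro-states
$\mu \equiv \{n_i\}$ are given by

$$p(\{n_i\}) = \frac{1}{Z} \exp\left[-\beta \epsilon \sum_{i=1}^{N} n_i\right] . \tag{IV.71}$$

From the partition function,

$$Z(T, N) = \sum_{\{n_i\}} \exp\left[-\beta \epsilon \sum_{i=1}^{N} n_i\right]
= \left(\sum_{n_1 = 0}^{1} e^{-\beta \epsilon n_1}\right) \cdots \left(\sum_{n_N = 0}^{1} e^{-\beta \epsilon n_N}\right)
= \left(1 + e^{-\beta \epsilon}\right)^N , \tag{IV.72}$$

we obtain the free energy

$$F(T, N) = -k_B T \ln Z = -N k_B T \ln\left[1 + e^{-\epsilon/(k_B T)}\right] . \tag{IV.73}$$

The entropy is now given by

$$S = -\left.\frac{\partial F}{\partial T}\right|_N
= \underbrace{N k_B \ln\left[1 + e^{-\epsilon/(k_B T)}\right]}_{-F/T}
+ N k_B T \left(\frac{\epsilon}{k_B T^2}\right) \frac{e^{-\epsilon/(k_B T)}}{1 + e^{-\epsilon/(k_B T)}} . \tag{IV.74}$$

The internal energy,

$$E = F + TS = \frac{N \epsilon}{1 + e^{\epsilon/(k_B T)}} , \tag{IV.75}$$

can also be obtained from

$$E = -\frac{\partial \ln Z}{\partial \beta} = \frac{N \epsilon\, e^{-\beta \epsilon}}{1 + e^{-\beta \epsilon}} . \tag{IV.76}$$

Since the joint probability in eq.(IV.71) is in the form of a product, the excitations of
different impurities are independent of each other, with the unconditional distribution

$$p(n_i) = \frac{e^{-\beta \epsilon n_i}}{1 + e^{-\beta \epsilon}} . \tag{IV.77}$$

This result coincides with eq.(IV.25), obtained through a more elaborate analysis in the
microcanonical ensemble. As expected, in the large N limit, the canonical and microcanonical
ensembles describe exactly the same physics, both at the macroscopic and microscopic
levels.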
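
For small N the two-level results can be checked by brute force. The sketch below sums over all 2^N micro-states and compares with the closed forms of eqs.(IV.72) and (IV.75). The parameter values and function names are illustrative assumptions, with energies measured in units where k_B = 1.

```python
import math
from itertools import product

def brute_force(N, beta, eps):
    """Sum over all 2^N micro-states {n_i}; return Z and the average energy."""
    Z = 0.0
    energy_sum = 0.0
    for state in product((0, 1), repeat=N):
        energy = eps * sum(state)
        weight = math.exp(-beta * energy)
        Z += weight
        energy_sum += energy * weight
    return Z, energy_sum / Z

def closed_form(N, beta, eps):
    """Z from eq.(IV.72) and E from eq.(IV.75)."""
    Z = (1.0 + math.exp(-beta * eps)) ** N
    E = N * eps / (math.exp(beta * eps) + 1.0)
    return Z, E

print(brute_force(8, 1.3, 0.7))
print(closed_form(8, 1.3, 0.7))
```

The enumeration grows as 2^N, so it is only practical for a handful of impurities; the factorized form is what makes large N tractable.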




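2. The Ideal Gas: For the canonical macro-state $M \equiv (T, V, N)$, the joint PDF for the
micro-states $\mu \equiv \{\vec{p}_i, \vec{q}_i\}$ is

$$p(\{\vec{p}_i, \vec{q}_i\}) = \frac{1}{Z} \exp\left[-\beta \sum_{i=1}^{N} \frac{p_i^2}{2m}\right] \cdot
\begin{cases} 1 & \text{for } \{\vec{q}_i\} \in \text{box} \\ 0 & \text{otherwise} \end{cases} \tag{IV.78}$$

Including the modifications to the phase space of identical particles in eq.(IV.51), the
dimensionless partition function is computed as

$$Z(T, V, N) = \int \frac{1}{N!} \prod_{i=1}^{N} \frac{d^3\vec{q}_i \, d^3\vec{p}_i}{h^3} \exp\left[-\beta \sum_{i=1}^{N} \frac{p_i^2}{2m}\right]
= \frac{V^N}{N!} \left(\frac{2 \pi m k_B T}{h^2}\right)^{3N/2}
= \frac{1}{N!} \left(\frac{V}{\lambda(T)^3}\right)^N , \tag{IV.79}$$

where

$$\lambda(T) = \frac{h}{\sqrt{2 \pi m k_B T}} , \tag{IV.80}$$

is a characteristic length associated with the action h. It shall be demonstrated later on
that this length scale controls the onset of quantum mechanical effects in an ideal gas.

The free energy is given by

$$F = -k_B T \ln Z = -N k_B T \ln V + N k_B T \ln N - N k_B T - \frac{3N}{2} k_B T \ln\left(\frac{2 \pi m k_B T}{h^2}\right)
= -N k_B T \left[\ln\left(\frac{V e}{N}\right) + \frac{3}{2} \ln\left(\frac{2 \pi m k_B T}{h^2}\right)\right] . \tag{IV.81}$$

Various thermodynamic properties of the ideal gas can now be obtained from
$dF = -S\,dT - P\,dV + \mu\,dN$. For example, from the entropy

$$-S = \left.\frac{\partial F}{\partial T}\right|_{V,N}
= -N k_B \left[\ln\left(\frac{V e}{N}\right) + \frac{3}{2} \ln\left(\frac{2 \pi m k_B T}{h^2}\right)\right] - N k_B T \frac{3}{2T}
= \frac{F - E}{T} , \tag{IV.82}$$

we obtain the internal energy $E = 3 N k_B T / 2$. The equation of state is obtained from

$$P = -\left.\frac{\partial F}{\partial V}\right|_{T,N} = \frac{N k_B T}{V} \quad \Rightarrow \quad PV = N k_B T , \tag{IV.83}$$

and the chemical potential is given by

$$\mu = \left.\frac{\partial F}{\partial N}\right|_{T,V} = \frac{F}{N} + k_B T = \frac{E - TS + PV}{N} = k_B T \ln\left(n \lambda^3\right) . \tag{IV.84}$$

Also, according to eq.(IV.78), the momenta of the N particles are taken from independent
Maxwell-Boltzmann distributions, consistent with eq.(IV.39).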
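
The thermodynamic relations quoted for the ideal gas can be checked numerically by differentiating the free energy of eq.(IV.81). The sketch below uses central finite differences in units where h = m = k_B = 1; the unit choice, the helper `partial`, and the sample state point are assumptions for illustration only.

```python
import math

def free_energy(T, V, N, m=1.0, h=1.0, kB=1.0):
    """F(T,V,N) of eq.(IV.81), written as -N kB T [ln(V / (N lambda^3)) + 1]."""
    lam = h / math.sqrt(2 * math.pi * m * kB * T)   # thermal wavelength, eq.(IV.80)
    return -N * kB * T * (math.log(V / (N * lam ** 3)) + 1.0)

def partial(f, args, i, rel_step=1e-5):
    """Central finite-difference derivative of f with respect to its i-th argument."""
    up, down = list(args), list(args)
    step = rel_step * abs(args[i])
    up[i] += step
    down[i] -= step
    return (f(*up) - f(*down)) / (2 * step)

T, V, N = 2.0, 50.0, 10.0                      # hypothetical state point
P = -partial(free_energy, (T, V, N), 1)        # P = -dF/dV
S = -partial(free_energy, (T, V, N), 0)        # S = -dF/dT
mu = partial(free_energy, (T, V, N), 2)        # mu = dF/dN
E = free_energy(T, V, N) + T * S               # E = F + TS
print(P, E, mu)
```

With these inputs P should be close to N k_B T / V, E close to 3 N k_B T / 2, and mu close to k_B T ln(n lambda^3), up to the finite-difference error.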




IV.H The Gibbs Canonical Ensemble 


We can also define a generalized canonical ensemble in which the internal energy
changes by the addition of both heat and work. The macrostates $M \equiv (T, \mathbf{J})$ are specified
in terms of the external temperature and forces acting on the system; the thermodynamic
coordinates $\mathbf{x}$ appear as additional random variables. The system is maintained at constant
force through external elements (e.g. pistons or magnets). Including the work done against
the forces, the energy of the combined system that includes these elements is $\mathcal{H} - \mathbf{J} \cdot \mathbf{x}$.
Note that while the work done on the system is $+\mathbf{J} \cdot \mathbf{x}$, the energy change associated with
the external elements with coordinates $\mathbf{x}$ has the opposite sign. The microstates of this
combined system occur with the (canonical) probabilities

$$p(\mu_S, \mathbf{x}) = \exp\left[-\beta \mathcal{H}(\mu_S) + \beta \mathbf{J} \cdot \mathbf{x}\right] / Z(T, N, \mathbf{J}) , \tag{IV.85}$$

with the Gibbs partition function,

$$Z(N, T, \mathbf{J}) = \sum_{\mu_S, \mathbf{x}} e^{\beta \mathbf{J} \cdot \mathbf{x} - \beta \mathcal{H}(\mu_S)} . \tag{IV.86}$$

(Note that we have explicitly included the particle number N to indicate that there is no
chemical work. Chemical work is considered in the Grand Canonical Ensemble, which is
discussed next.)

In this ensemble, the expectation value of the coordinates is obtained from

$$\langle \mathbf{x} \rangle = k_B T \frac{\partial \ln Z}{\partial \mathbf{J}} , \tag{IV.87}$$

which together with the thermodynamic identity $\mathbf{x} = -\partial G / \partial \mathbf{J}$, suggests the identification

$$G(N, T, \mathbf{J}) = -k_B T \ln Z , \tag{IV.88}$$

where $G = E - TS - \mathbf{x} \cdot \mathbf{J}$ is the Gibbs free energy. (The same conclusion can be reached
by equating Z in eq.(IV.86) to the term that maximizes the probability with respect to
$\mathbf{x}$.) The enthalpy $H \equiv E - \mathbf{x} \cdot \mathbf{J}$ is easily obtained in this ensemble from

$$-\frac{\partial \ln Z}{\partial \beta} = \langle \mathcal{H} - \mathbf{x} \cdot \mathbf{J} \rangle = H . \tag{IV.89}$$

Note that heat capacities at constant force (which include work done against the external
forces) are obtained from the enthalpy as $C_{\mathbf{J}} = \partial H / \partial T$.




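The following examples illustrate the use of the Gibbs canonical ensemble:

1. The Ideal Gas in the isobaric ensemble is described by the macrostate $M \equiv (N, T, P)$.
A micro-state $\mu \equiv \{\vec{p}_i, \vec{q}_i\}$, with a volume V, occurs with the probability

$$p(\{\vec{p}_i, \vec{q}_i\}, V) = \frac{1}{Z} \exp\left[-\beta \sum_{i=1}^{N} \frac{p_i^2}{2m} - \beta P V\right] \cdot
\begin{cases} 1 & \text{for } \{\vec{q}_i\} \in \text{box of volume } V \\ 0 & \text{otherwise} \end{cases} \tag{IV.90}$$

The normalization factor is now

$$Z(N, T, P) = \int_0^\infty dV\, e^{-\beta P V} \int \frac{1}{N!} \prod_{i=1}^{N} \frac{d^3\vec{q}_i \, d^3\vec{p}_i}{h^3} \exp\left[-\beta \sum_{i=1}^{N} \frac{p_i^2}{2m}\right]
= \int_0^\infty dV\, \frac{V^N e^{-\beta P V}}{N!\, \lambda(T)^{3N}}
= \frac{1}{(\beta P)^{N+1} \lambda(T)^{3N}} . \tag{IV.91}$$

For large N, the Gibbs free energy is given by

$$G = -k_B T \ln Z \approx N k_B T \left[\ln P - \frac{5}{2} \ln(k_B T) + \frac{3}{2} \ln\left(\frac{h^2}{2 \pi m}\right)\right] . \tag{IV.92}$$

Starting from $dG = -S\,dT + V\,dP + \mu\,dN$, the volume of the gas is obtained as

$$V = \left.\frac{\partial G}{\partial P}\right|_{T,N} = \frac{N k_B T}{P} \quad \Rightarrow \quad PV = N k_B T . \tag{IV.93}$$

The enthalpy $H = \langle E + PV \rangle$ is easily calculated from

$$H = -\frac{\partial \ln Z}{\partial \beta} = \frac{5}{2} N k_B T ,$$

from which we get $C_P = dH/dT = \frac{5}{2} N k_B$.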

2. Spins in an external magnetic field $\vec{B}$ provide a common example for usage of the
Gibbs canonical ensemble. Adding the work done against the magnetic field to the internal
Hamiltonian $\mathcal{H}$ results in the Gibbs partition function

$$Z(N, T, B) = {\rm tr}\left[\exp\left(-\beta \mathcal{H} + \beta \vec{B} \cdot \vec{M}\right)\right] ,$$

where $\vec{M}$ is the net magnetization. The symbol tr is used to indicate the sum over all
spin degrees of freedom, which in a quantum mechanical formulation are restricted to
discrete values. The simplest case is spin of 1/2, with two possible projections of the spin
along the magnetic field. A microstate of N spins is now described by the set of Ising
variables $\{\sigma_i = \pm 1\}$. The corresponding magnetization along the field direction is given
by $M = \mu_0 \sum_{i=1}^{N} \sigma_i$, where $\mu_0$ is a microscopic magnetic moment. Assuming that there
are no interactions between spins ($\mathcal{H} = 0$), the probability of a microstate is

$$p(\{\sigma_i\}) = \frac{1}{Z} \exp\left[\beta B \mu_0 \sum_{i=1}^{N} \sigma_i\right] . \tag{IV.94}$$

Clearly, this is closely related to the example of two level systems discussed in the canonical
ensemble, and we can easily obtain the Gibbs partition function

$$Z(N, T, B) = \left[2 \cosh(\beta \mu_0 B)\right]^N , \tag{IV.95}$$

and the Gibbs free energy

$$G = -k_B T \ln Z = -N k_B T \ln\left[2 \cosh(\beta \mu_0 B)\right] . \tag{IV.96}$$

The average magnetization is given by

$$M = -\frac{\partial G}{\partial B} = N \mu_0 \tanh(\beta \mu_0 B) . \tag{IV.97}$$

Expanding eq.(IV.97) for small B results in the well-known Curie law for magnetic
susceptibility of non-interacting spins,

$$\chi(T) = \left.\frac{\partial M}{\partial B}\right|_{B=0} = \frac{N \mu_0^2}{k_B T} . \tag{IV.98}$$

The enthalpy is simply $H = \langle \mathcal{H} - BM \rangle = -BM$, and $C_B = -B\, \partial M / \partial T$.
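
The magnetization of eq.(IV.97) and the Curie law of eq.(IV.98) can be checked for a few spins by direct enumeration. This is only a sketch: the function names, the finite-difference step `dB`, and the sample parameters are assumptions, and units are chosen so that k_B = 1.

```python
import math
from itertools import product

def magnetization(N, beta, B, mu0=1.0):
    """Average M from a brute-force sum over all 2^N Ising configurations."""
    Z = 0.0
    M_sum = 0.0
    for sigma in product((-1, 1), repeat=N):
        M = mu0 * sum(sigma)
        weight = math.exp(beta * B * M)
        Z += weight
        M_sum += M * weight
    return M_sum / Z

def curie_susceptibility(N, beta, mu0=1.0, dB=1e-4):
    """Zero-field susceptibility dM/dB estimated by a central difference."""
    return (magnetization(N, beta, dB, mu0) - magnetization(N, beta, -dB, mu0)) / (2 * dB)

print(magnetization(6, 0.7, 0.4), 6 * math.tanh(0.7 * 0.4))
print(curie_susceptibility(6, 0.7), 6 * 0.7)
```

Each printed pair compares the enumeration with the closed form, so the second number in each line is the value expected from the equations above.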


IV.I The Grand Canonical Ensemble 


The previous sections demonstrate that while the canonical and microcanonical
ensembles are completely equivalent in the thermodynamic limit, it is frequently much easier
to perform statistical mechanical computations in the canonical framework. Sometimes
it is more convenient to allow chemical work (by fixing the chemical potential $\mu$, rather
than the number of particles), but no mechanical work. The resulting macro-states
$M \equiv (T, \mu, \mathbf{x})$ are governed by the grand canonical ensemble. The corresponding
micro-states $\mu_S$ contain an indefinite number of particles $N(\mu_S)$. As in the case of the canonical
ensemble, the system S can be maintained at a constant chemical potential through contact
with a reservoir R, at temperature T and chemical potential $\mu$. The probability distribution
for the micro-states of S is obtained by summing over all states of the reservoir, as in
eq.(IV.53), and is given by

$$p(\mu_S) = \exp\left[\beta \mu N(\mu_S) - \beta \mathcal{H}(\mu_S)\right] / \mathcal{Q} . \tag{IV.99}$$

The normalization factor is the grand partition function,

$$\mathcal{Q}(T, \mu, \mathbf{x}) = \sum_{\mu_S} e^{\beta \mu N(\mu_S) - \beta \mathcal{H}(\mu_S)} . \tag{IV.100}$$

We can reorganize the above summation by grouping together all micro-states with a
given number of particles, i.e.

$$\mathcal{Q}(T, \mu, \mathbf{x}) = \sum_{N=0}^{\infty} e^{\beta \mu N} \sum_{(\mu_S | N)} e^{-\beta \mathcal{H}_N(\mu_S)} . \tag{IV.101}$$

The restricted sums in eq.(IV.101) are just the N-particle partition functions. As each term
in $\mathcal{Q}$ is the total weight of all micro-states of N particles, the unconditional probability of
finding N particles in the system is

$$p(N) = \frac{e^{\beta \mu N} Z(T, N, \mathbf{x})}{\mathcal{Q}(T, \mu, \mathbf{x})} . \tag{IV.102}$$


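The average number of particles in the system is

$$\langle N \rangle = \frac{1}{\mathcal{Q}} \frac{\partial \mathcal{Q}}{\partial (\beta \mu)} = \frac{\partial}{\partial (\beta \mu)} \ln \mathcal{Q} , \tag{IV.103}$$

while the number fluctuations are related to the variance

$$\langle N^2 \rangle_c = \langle N^2 \rangle - \langle N \rangle^2
= \frac{1}{\mathcal{Q}} \frac{\partial^2 \mathcal{Q}}{\partial (\beta \mu)^2} - \left(\frac{\partial \ln \mathcal{Q}}{\partial (\beta \mu)}\right)^2
= \frac{\partial^2 \ln \mathcal{Q}}{\partial (\beta \mu)^2} = \frac{\partial \langle N \rangle}{\partial (\beta \mu)} . \tag{IV.104}$$

The variance is thus proportional to N, and the relative number fluctuations vanish in the
thermodynamic limit, establishing the equivalence of this ensemble to the previous ones.

Because of the sharpness of the distribution for N, the sum in eq.(IV.101) can be
approximated by its largest term at $N = N^* \approx \langle N \rangle$, i.e.

$$\mathcal{Q}(T, \mu, \mathbf{x}) = \lim_{N \to \infty} \sum_{N=0}^{\infty} e^{\beta \mu N} Z(T, N, \mathbf{x})
= e^{\beta \mu N^*} Z(T, N^*, \mathbf{x}) = e^{\beta \mu N^* - \beta F}
= e^{-\beta(-\mu N^* + E - TS)} = e^{-\beta \mathcal{G}} , \tag{IV.105}$$

where

$$\mathcal{G}(T, \mu, \mathbf{x}) = E - TS - \mu N = -k_B T \ln \mathcal{Q} , \tag{IV.106}$$

is the grand potential. Thermodynamic information is obtained by using
$d\mathcal{G} = -S\,dT - N\,d\mu + \mathbf{J} \cdot d\mathbf{x}$, as

$$-S = \left.\frac{\partial \mathcal{G}}{\partial T}\right|_{\mu, \mathbf{x}} , \quad
N = -\left.\frac{\partial \mathcal{G}}{\partial \mu}\right|_{T, \mathbf{x}} , \quad
J_i = \left.\frac{\partial \mathcal{G}}{\partial x_i}\right|_{T, \mu} . \tag{IV.107}$$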


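As a final example, we compute the properties of the ideal gas of non-interacting
particles in the grand canonical ensemble. The macro-state is $M \equiv (T, \mu, V)$, and the
corresponding micro-states $\{\vec{p}_1, \vec{q}_1, \vec{p}_2, \vec{q}_2, \cdots\}$ have indefinite particle number. The grand
partition function is given by

$$\mathcal{Q}(T, \mu, V) = \sum_{N=0}^{\infty} e^{\beta \mu N} \frac{1}{N!} \int \prod_{i=1}^{N} \frac{d^3\vec{q}_i \, d^3\vec{p}_i}{h^3} \exp\left[-\beta \sum_i \frac{p_i^2}{2m}\right]
= \sum_{N=0}^{\infty} \frac{e^{\beta \mu N}}{N!} \left(\frac{V}{\lambda^3}\right)^N
= \exp\left[e^{\beta \mu} \frac{V}{\lambda^3}\right] , \tag{IV.108}$$

(with $\lambda = h / \sqrt{2 \pi m k_B T}$), and the grand potential is

$$\mathcal{G}(T, \mu, V) = -k_B T \ln \mathcal{Q} = -k_B T e^{\beta \mu} \frac{V}{\lambda^3} . \tag{IV.109}$$

But, since $\mathcal{G} = E - TS - \mu N = -PV$, the gas pressure can be obtained directly as

$$P = -\frac{\mathcal{G}}{V} = k_B T \frac{e^{\beta \mu}}{\lambda^3} . \tag{IV.110}$$

The particle number and the chemical potential are related by

$$N = -\left.\frac{\partial \mathcal{G}}{\partial \mu}\right|_{T,V} = \frac{e^{\beta \mu} V}{\lambda^3} . \tag{IV.111}$$

The equation of state is obtained by comparing eqs.(IV.110) and (IV.111), as
$P = k_B T N / V$. Finally, the chemical potential is given by

$$\mu = k_B T \ln\left(\lambda^3 \frac{N}{V}\right) = k_B T \ln\left(\frac{P \lambda^3}{k_B T}\right) . \tag{IV.112}$$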
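
For the ideal gas, the particle-number distribution of eq.(IV.102) is a Poisson distribution, so the mean and variance are equal, as eqs.(IV.103) and (IV.104) require. The sketch below checks this numerically; the single parameter `a`, standing for e^{beta mu} V / lambda^3, and the truncation `n_max` are assumptions introduced for the example.

```python
import math

def number_distribution(a, n_max=200):
    """p(N) = a^N / (N! Q) for the ideal gas, with a = e^{beta mu} V / lambda^3.

    Returns the list of probabilities for N = 0..n_max and the truncated sum Q.
    """
    # Work with logarithms of the weights a^N / N! to avoid overflow.
    log_w = [N * math.log(a) - math.lgamma(N + 1) for N in range(n_max + 1)]
    Q = sum(math.exp(lw) for lw in log_w)
    return [math.exp(lw) / Q for lw in log_w], Q

a = 12.5                                   # hypothetical value of e^{beta mu} V / lambda^3
p, Q = number_distribution(a)
mean_N = sum(N * pN for N, pN in enumerate(p))
var_N = sum(N * N * pN for N, pN in enumerate(p)) - mean_N ** 2
print(mean_N, var_N, math.log(Q))
```

All three printed numbers should be close to `a`: the mean and variance because the distribution is Poisson, and ln Q because of eq.(IV.108).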




V. Interacting Particles 


V.A The Cumulant Expansion 


The examples studied in the previous section involve non-interacting particles. It is
precisely the lack of interactions that renders these problems exactly solvable. Interactions,
however, are responsible for the wealth of interesting materials and phases observed in
nature. We would thus like to understand the role of interactions amongst particles, and
learn how to treat them in statistical mechanics. For a general Hamiltonian,

$$\mathcal{H}_N = \sum_{i=1}^{N} \frac{p_i^2}{2m} + U(\vec{q}_1, \cdots, \vec{q}_N) , \tag{V.1}$$

the partition function can be written as

$$Z(T, V, N) = \frac{1}{N!} \int \prod_{i=1}^{N} \frac{d^3\vec{p}_i \, d^3\vec{q}_i}{h^3} \exp\left[-\beta \sum_i \frac{p_i^2}{2m}\right] \exp\left[-\beta U(\vec{q}_1, \cdots, \vec{q}_N)\right]
= Z_0(T, V, N) \left\langle \exp\left[-\beta U(\vec{q}_1, \cdots, \vec{q}_N)\right] \right\rangle^0 , \tag{V.2}$$

where $Z_0(T, V, N) = (V/\lambda^3)^N / N!$ is the partition function of the ideal gas (eq.(IV.79)),
and $\langle \mathcal{O} \rangle^0$ denotes the expectation value of $\mathcal{O}$ computed with the probability distribution of
the non-interacting system. In terms of the cumulants of the random variable U, eq.(V.2)
can be recast as

$$\ln Z = \ln Z_0 + \sum_{\ell=1}^{\infty} \frac{(-\beta)^\ell}{\ell!} \left\langle U^\ell \right\rangle_c^0 . \tag{V.3}$$

The cumulants are related to the moments by the relations in section II.B. Since U depends
only on $\{\vec{q}_i\}$, which are uniformly and independently distributed within the box of volume
V, the moments are given by

$$\left\langle U^\ell \right\rangle^0 = \int \left(\prod_{i=1}^{N} \frac{d^3\vec{q}_i}{V}\right) U(\vec{q}_1, \cdots, \vec{q}_N)^\ell . \tag{V.4}$$


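Various expectation values can also be calculated perturbatively, from

$$\langle \mathcal{O} \rangle = \frac{1}{Z} \frac{1}{N!} \int \prod_{i=1}^{N} \frac{d^3\vec{p}_i \, d^3\vec{q}_i}{h^3} \exp\left[-\beta \sum_i \frac{p_i^2}{2m}\right] \exp\left[-\beta U(\vec{q}_1, \cdots, \vec{q}_N)\right] \times \mathcal{O}
= \frac{\langle \mathcal{O} \exp[-\beta U] \rangle^0}{\langle \exp[-\beta U] \rangle^0}
= i \left.\frac{\partial}{\partial k} \ln \left\langle \exp[-ik\mathcal{O} - \beta U] \right\rangle^0 \right|_{k=0} . \tag{V.5}$$

The final expectation value generates the joint cumulants of the random variables $\mathcal{O}$ and
U, as

$$\ln \left\langle \exp[-ik\mathcal{O} - \beta U] \right\rangle^0 = \sum_{\ell, \ell'} \frac{(-ik)^{\ell'}}{\ell'!} \frac{(-\beta)^{\ell}}{\ell!} \left\langle \mathcal{O}^{\ell'} * U^{\ell} \right\rangle_c^0 , \tag{V.6}$$

resulting in

$$\langle \mathcal{O} \rangle = \sum_{\ell=0}^{\infty} \frac{(-\beta)^\ell}{\ell!} \left\langle \mathcal{O} * U^\ell \right\rangle_c^0 . \tag{V.7}$$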


The simplest system for treating interactions is again the dilute gas. As discussed in
chapter II, for a weakly interacting gas we can specialize to

$$U(\vec{q}_1, \cdots, \vec{q}_N) = \sum_{i<j} \mathcal{V}(\vec{q}_i - \vec{q}_j) , \tag{V.8}$$

where $\mathcal{V}(\vec{q}_i - \vec{q}_j)$ is a pair-wise interaction between particles. The first correction in eq.(V.3)
is

$$\langle U \rangle_c^0 = \sum_{i<j} \int \frac{d^3\vec{q}_i}{V} \frac{d^3\vec{q}_j}{V} \mathcal{V}(\vec{q}_i - \vec{q}_j)
= \frac{N(N-1)}{2V} \int d^3\vec{q}\, \mathcal{V}(\vec{q}) . \tag{V.9}$$

The final result is obtained by performing the integrals over the relative and center of mass
coordinates of $\vec{q}_i$ and $\vec{q}_j$ separately. (Each of the N(N − 1)/2 pairs makes an identical
contribution.)


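The second order correction,

$$\left\langle U^2 \right\rangle_c^0 = \sum_{i<j,\, k<l} \left[ \left\langle \mathcal{V}(\vec{q}_i - \vec{q}_j) \mathcal{V}(\vec{q}_k - \vec{q}_l) \right\rangle^0
- \left\langle \mathcal{V}(\vec{q}_i - \vec{q}_j) \right\rangle^0 \left\langle \mathcal{V}(\vec{q}_k - \vec{q}_l) \right\rangle^0 \right] , \tag{V.10}$$

is the sum of $[N(N-1)/2]^2$ terms that can be grouped as follows:
(i) There is no contribution from terms in which the four indices {i, j, k, l} are different.
This is because the different $\{\vec{q}_i\}$ are independently distributed and $\langle \mathcal{V}(\vec{q}_i - \vec{q}_j) \mathcal{V}(\vec{q}_k - \vec{q}_l) \rangle^0$
equals $\langle \mathcal{V}(\vec{q}_i - \vec{q}_j) \rangle^0 \langle \mathcal{V}(\vec{q}_k - \vec{q}_l) \rangle^0$.
(ii) There is one common index between the two pairs, e.g. {(i, j), (i, l)}. By changing
coordinates to $\vec{q}_{ij} = \vec{q}_i - \vec{q}_j$ and $\vec{q}_{il} = \vec{q}_i - \vec{q}_l$, it again follows that $\langle \mathcal{V}(\vec{q}_i - \vec{q}_j) \mathcal{V}(\vec{q}_i - \vec{q}_l) \rangle^0$
equals $\langle \mathcal{V}(\vec{q}_i - \vec{q}_j) \rangle^0 \langle \mathcal{V}(\vec{q}_i - \vec{q}_l) \rangle^0$. The vanishing of these terms is a consequence of the
translational symmetry of the problem in the absence of an external potential.
(iii) In the remaining N(N − 1)/2 terms the pairs are identical, resulting in

$$\left\langle U^2 \right\rangle_c^0 = \frac{N(N-1)}{2} \left[ \int \frac{d^3\vec{q}}{V} \mathcal{V}(\vec{q})^2 - \left( \int \frac{d^3\vec{q}}{V} \mathcal{V}(\vec{q}) \right)^2 \right] . \tag{V.11}$$

The second term in the above equation is smaller by a factor of $d^3/V$, where d is a
characteristic range for the potential $\mathcal{V}$. For any reasonable potential that decays with
distance, this term vanishes in the thermodynamic limit.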

Similar groupings occur for higher order terms in this cumulant expansion. It is helpful 
to visualize the terms in the expansion diagrammatically as follows: 

(a) For a term of order $\ell$, draw $\ell$ pairs of points (representing $\vec{q}_i$ and $\vec{q}_j$) connected
by bonds, representing the interaction $\mathcal{V}_{ij} \equiv \mathcal{V}(\vec{q}_i - \vec{q}_j)$. An overall factor of $1/\ell!$
accompanies such graphs.

(b) By multiple selections of the same index i, two or more bonds can be joined together
forming a diagram of interconnected points. There is a factor $S_\ell$ associated with the
number of ways of assigning labels 1 through N to the different points of the graph.
Ignoring the differences between N, N − 1, etc., a diagram with $n_s$ points makes
a contribution proportional to $N^{n_s}$. There is typically also division by a symmetry
factor which takes into account the number of equivalent assignments. For example,
the diagrams involving a pair of points, calculated in eqs.(V.9) and (V.11), have a
symmetry factor of 1/2.

(c) Apart from these numerical prefactors, the contribution of a diagram is an integral
over all the $n_s$ coordinates $\vec{q}_i$, of products of corresponding $\mathcal{V}_{ij}$. If the graph has $n_c$
disconnected clusters, integration over the center of mass coordinates of the clusters
gives a factor of $V^{n_c}$.

Fortunately, many cancellations occur in calculating cumulants. In particular:

• When calculating the moment $\langle U^\ell \rangle^0$, the contribution of a disconnected diagram is simply
the product of its disjoint clusters. The coordinates of these clusters are independent
random variables, and make no contribution to the joint cumulant $\langle U^\ell \rangle_c^0$. This result also
ensures the extensivity of ln Z, as the surviving connected diagrams give a factor of V
from their center of mass integration. (Disconnected clusters have more factors of V, and
are non-extensive.)

• There are also one particle reducible clusters which are fully connected, yet fall to disjoint
fragments if a single coordinate point is removed. By measuring all other coordinates
relative to this special point, it can be seen that (in a translationally invariant system)
the value of such a diagram is the product of its disjoint fragments. It can be shown that
such diagrams are also cancelled out in calculating the cumulant. Thus only one particle
irreducible clusters survive in this cumulant expansion. A cluster with $n_s$ sites and $\ell$ bonds
makes a contribution of order of $N (N/V)^{n_s - 1} (\beta \mathcal{V})^\ell$ to ln Z.

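Ignoring terms of order of 1/N, the cumulant expansion leads to a corrected free
energy,

$$F(T, V, N) = F_0(T, V, N) + \frac{N^2}{2V} \left( \int d^3\vec{q}\, \mathcal{V}(\vec{q}) - \frac{\beta}{2} \int d^3\vec{q}\, \mathcal{V}(\vec{q})^2 + \mathcal{O}\left(\beta^2 \mathcal{V}^3\right) \right)
+ \mathcal{O}\left(\frac{N^3 \beta^2 \mathcal{V}^3}{V^2}\right) . \tag{V.12}$$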


From this expression we can proceed to calculate other modified state functions, e.g.
$P = -\partial F / \partial V |_{T,N}$. Unfortunately, the expansion in powers of $\beta \mathcal{V}$ is not particularly useful.
The inter-atomic potential $\mathcal{V}(\vec{r})$ for most particles has an attractive tail due to van der
Waals interactions that decays as $-1/r^6$ at large separations $r = |\vec{r}|$. At short distances
the overlap of the electron clouds makes the potential strongly repulsive. Typically there
is a minimum of depth a few hundred kelvin, at a distance of a few angstroms.
The infinity in $\mathcal{V}(\vec{r})$ at short distances makes it an unsuitable expansion parameter. This
problem can be alleviated by a partial resummation of diagrams. For example, to get the
correction at order of $N^2/V$, we need to sum over all two point clusters, independent of
the number of bonds. The resulting sum is actually quite trivial, and leads to

$$\ln Z = \ln Z_0 + \frac{N(N-1)}{2} \sum_{\ell=1}^{\infty} \frac{(-\beta)^\ell}{\ell!} \int \frac{d^3\vec{q}}{V} \mathcal{V}(\vec{q})^\ell + \mathcal{O}\left(\frac{N^3}{V^2}\right)
= \ln Z_0 + \frac{N(N-1)}{2V} \int d^3\vec{q} \left[\exp\left(-\beta \mathcal{V}(\vec{q})\right) - 1\right] + \mathcal{O}\left(\frac{N^3}{V^2}\right) . \tag{V.13}$$

The quantity $f(\vec{q}) = \exp(-\beta \mathcal{V}(\vec{q})) - 1$ is a much more convenient expansion parameter
which goes to −1 at short distances and rapidly vanishes for large separations. In the next
section we shall recast the perturbative expansion in terms of this quantity.




V.B The Cluster Expansion 


For short range interactions, especially with a hard core, it is much better to replace
the expansion parameter $\mathcal{V}(\vec{q})$ by $f(\vec{q}) = \exp(-\beta \mathcal{V}(\vec{q})) - 1$, which is obtained by summing
over all possible numbers of bonds between two points on a cumulant graph. The resulting
series is organized in powers of the density N/V, and is most suitable for obtaining a virial
expansion, which expresses the deviations from the ideal gas equation of state in a power
series

$$\frac{P}{k_B T} = \frac{N}{V} \left[ 1 + B_2(T) \frac{N}{V} + B_3(T) \left(\frac{N}{V}\right)^2 + \cdots \right] . \tag{V.14}$$

The temperature dependent parameters, $B_\ell(T)$, are known as the virial coefficients and
originate from the inter-particle interactions. Our initial goal is to compute these
coefficients from first principles.

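To illustrate a different method of expansion, we shall perform computations in the
grand canonical ensemble. With a macro-state $M \equiv (T, \mu, V)$, the grand partition function
is given by

$$\mathcal{Q}(\mu, T, V) = \sum_{N=0}^{\infty} e^{\beta \mu N} Z(N, T, V) = \sum_{N=0}^{\infty} \frac{1}{N!} \left(\frac{e^{\beta \mu}}{\lambda^3}\right)^N S_N , \tag{V.15}$$

where

$$S_N = \int \prod_{i=1}^{N} d^3\vec{q}_i \prod_{i<j} (1 + f_{ij}) , \tag{V.16}$$

and $f_{ij} = f(\vec{q}_i - \vec{q}_j)$.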


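The $2^{N(N-1)/2}$ terms in $S_N$ can now be ordered in powers of $f_{ij}$ as

$$S_N = \int \prod_{i=1}^{N} d^3\vec{q}_i \left( 1 + \sum_{i<j} f_{ij} + \sum_{i<j,\, k<l} f_{ij} f_{kl} + \cdots \right) . \tag{V.17}$$

An efficient method for organizing the perturbation series is to represent the various
contributions diagrammatically. In particular we shall apply the following conventions:

(a) Draw N dots labelled by $i = 1, \cdots, N$ to represent the coordinates $\vec{q}_1$ through $\vec{q}_N$,

    [Figure: N unconnected dots labelled 1, 2, ..., N]

(b) Each term in eq.(V.17) corresponds to a product of $f_{ij}$, represented by drawing lines
connecting i and j for each $f_{ij}$. For example, the graph,

    [Figure: dot 1 isolated; dots 2-3 joined by a line; dots 4-5-6 joined in a chain; ...; dot N isolated]

represents the integral

$$\left( \int d^3\vec{q}_1 \right) \left( \int d^3\vec{q}_2 \, d^3\vec{q}_3 \, f_{23} \right) \left( \int d^3\vec{q}_4 \, d^3\vec{q}_5 \, d^3\vec{q}_6 \, f_{45} f_{56} \right) \cdots \left( \int d^3\vec{q}_N \right) .$$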

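As the above example indicates, the value of each graph is the product of the
contributions from its linked clusters. Since these clusters are more fundamental, we reformulate
the sum in terms of them by defining a quantity $b_\ell$, equal to the sum over all $\ell$-particle
linked clusters (one-particle irreducible or not). For example

$$b_1 = \int d^3\vec{q} = V , \tag{V.18}$$

(a single dot), and

$$b_2 = \int d^3\vec{q}_1 \, d^3\vec{q}_2 \, f(\vec{q}_1 - \vec{q}_2) , \tag{V.19}$$

(two dots joined by a line). There are four diagrams contributing to $b_3$, leading to

$$b_3 = \int d^3\vec{q}_1 \, d^3\vec{q}_2 \, d^3\vec{q}_3 \left[ f(\vec{q}_1 - \vec{q}_2) f(\vec{q}_2 - \vec{q}_3) + f(\vec{q}_2 - \vec{q}_3) f(\vec{q}_3 - \vec{q}_1) + f(\vec{q}_3 - \vec{q}_1) f(\vec{q}_1 - \vec{q}_2)
+ f(\vec{q}_1 - \vec{q}_2) f(\vec{q}_2 - \vec{q}_3) f(\vec{q}_3 - \vec{q}_1) \right] . \tag{V.20}$$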
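A given N-particle graph can be decomposed to $n_1$ 1-clusters, $n_2$ 2-clusters, ···, $n_\ell$
$\ell$-clusters, etc. Hence,

$$S_N = {\sum_{\{n_\ell\}}}' \prod_{\ell} b_\ell^{n_\ell} \, W(\{n_\ell\}) , \tag{V.21}$$

where the restricted sum is over all distinct divisions of N points into a set of clusters $\{n_\ell\}$,
such that $\sum_\ell \ell n_\ell = N$. The coefficients $W(\{n_\ell\})$ are the number of ways of assigning N
particle labels to groups of $n_\ell$ $\ell$-clusters. For example, the divisions of 3 particles into a
1-cluster and a 2-cluster are

    [Figure: three graphs, with the isolated dot labelled 1, 2, or 3 and the remaining two dots joined by a line]

All above graphs have $n_1 = 1$ and $n_2 = 1$, and contribute a factor of $b_1 b_2$ to $S_3$; thus
W(1, 1) = 3.

In general, $W(\{n_\ell\})$ is the number of distinct ways of grouping the labels 1, ..., N
into bins of $n_\ell$ $\ell$-clusters. It can be obtained from the total number of permutations, N!,
after dividing by the number of equivalent assignments. Within each bin of $\ell n_\ell$ particles,
equivalent assignments are obtained by: (i) permuting the $\ell$ labels in each subgroup in $\ell!$
ways, for a total of $(\ell!)^{n_\ell}$ permutations; and (ii) the $n_\ell!$ rearrangements of the $n_\ell$ subgroups.
Hence,

$$W(\{n_\ell\}) = \frac{N!}{\prod_\ell n_\ell! \, (\ell!)^{n_\ell}} . \tag{V.22}$$

(We can indeed check that W(1, 1) = 3!/(1!)(2!) = 3 as obtained above.)

Using the above value of W, the expression for $S_N$ in eq.(V.21) can be evaluated.
However, the restriction of the sum to configurations such that $\sum_\ell \ell n_\ell = N$ complicates
the evaluation. Fortunately, this restriction disappears in the expression for the grand
partition function in eq.(V.15),

$$\mathcal{Q} = \sum_{N=0}^{\infty} \frac{1}{N!} \left(\frac{e^{\beta \mu}}{\lambda^3}\right)^N {\sum_{\{n_\ell\}}}' \frac{N!}{\prod_\ell n_\ell! \, (\ell!)^{n_\ell}} \prod_\ell b_\ell^{n_\ell} . \tag{V.23}$$

The restriction in the second sum is now removed by noting that
$\sum_{N=0}^{\infty} \sum_{\{n_\ell\}} \delta_{\sum_\ell \ell n_\ell , N} = \sum_{\{n_\ell\}}$. Therefore,

$$\mathcal{Q} = \sum_{\{n_\ell\}} \left(\frac{e^{\beta \mu}}{\lambda^3}\right)^{\sum_\ell \ell n_\ell} \prod_\ell \frac{b_\ell^{n_\ell}}{n_\ell! \, (\ell!)^{n_\ell}}
= \prod_\ell \sum_{n_\ell = 0}^{\infty} \frac{1}{n_\ell!} \left[ \left(\frac{e^{\beta \mu}}{\lambda^3}\right)^\ell \frac{b_\ell}{\ell!} \right]^{n_\ell}
= \exp\left[ \sum_{\ell=1}^{\infty} \left(\frac{e^{\beta \mu}}{\lambda^3}\right)^\ell \frac{b_\ell}{\ell!} \right] . \tag{V.24}$$

The above result has the simple geometrical interpretation that the sum over all graphs,
connected or not, equals the exponential of the sum over connected graphs. This is a
quite general result that is also related to the graphical connection between moments and
cumulants discussed in sec.II.B.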
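
The combinatorics behind eqs.(V.21) to (V.24) can be checked with exact rational arithmetic. The sketch below enumerates every grouping of N labels by brute force, which gives W directly, and compares S_N/N! with the series coefficients of the exponential in eq.(V.24). The numerical stand-ins for the b_l and all function names are assumptions for illustration; in the text the b_l are cluster integrals.

```python
from collections import Counter
from fractions import Fraction
from math import factorial

def set_partitions(labels):
    """Yield every division of the list `labels` into unordered, non-empty groups."""
    if not labels:
        yield []
        return
    first, rest = labels[0], labels[1:]
    for groups in set_partitions(rest):
        for i in range(len(groups)):          # add `first` to an existing group
            yield groups[:i] + [[first] + groups[i]] + groups[i + 1:]
        yield [[first]] + groups              # or start a new group

def W_formula(counts):
    """Eq.(V.22); `counts` maps cluster size l to multiplicity n_l."""
    N = sum(size * n for size, n in counts.items())
    denom = 1
    for size, n in counts.items():
        denom *= factorial(n) * factorial(size) ** n
    return factorial(N) // denom

def cluster_coefficient(N, b):
    """S_N / N! from eq.(V.21), with W obtained by explicit enumeration."""
    tally = Counter()
    for groups in set_partitions(list(range(N))):
        tally[tuple(sorted(len(g) for g in groups))] += 1   # counts W for each {n_l}
    total = Fraction(0)
    for sizes, W in tally.items():
        term = Fraction(W, factorial(N))
        for size in sizes:
            term *= b[size]
        total += term
    return total

def exponential_coefficients(N_max, b):
    """Coefficients c_N in exp(sum_l x^l b_l / l!) = sum_N c_N x^N."""
    g = [Fraction(0)] + [Fraction(b[l], factorial(l)) for l in range(1, N_max + 1)]
    c = [Fraction(1)]
    for N in range(1, N_max + 1):
        # From d/dx e^g = g' e^g:  N c_N = sum_k k g_k c_{N-k}
        c.append(sum(k * g[k] * c[N - k] for k in range(1, N + 1)) / N)
    return c

b = {1: 2, 2: -3, 3: 5, 4: 7, 5: -11, 6: 13}   # arbitrary integer stand-ins for b_l
c = exponential_coefficients(6, b)
print(W_formula({1: 1, 2: 1}))
print(all(cluster_coefficient(N, b) == c[N] for N in range(1, 7)))
```

Because the comparison is exact (fractions, not floats), agreement for every N up to 6 is a direct check that summing over all groupings reproduces the exponential of the connected terms.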


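The grand potential is now obtained from

$$\ln \mathcal{Q} = -\beta \mathcal{G} = \frac{PV}{k_B T} = \sum_{\ell=1}^{\infty} \left(\frac{e^{\beta \mu}}{\lambda^3}\right)^\ell \frac{b_\ell}{\ell!} . \tag{V.25}$$

In eq.(V.25), the extensivity condition is used to get $\mathcal{G} = E - TS - \mu N = -PV$. Thus
the terms on the right hand side of the above equation must also be proportional to the
volume V. This can be explicitly verified by noting that in evaluating each $b_\ell$ there is an
integral over the center of mass coordinate that explores the whole volume. For example,
$b_2 = \int d^3\vec{q}_1 \, d^3\vec{q}_2 \, f(\vec{q}_1 - \vec{q}_2) = V \int d^3\vec{q}_{12} \, f(\vec{q}_{12})$. Quite generally, we can set

$$\lim_{V \to \infty} b_\ell = V \bar{b}_\ell , \tag{V.26}$$

and the pressure is now obtained from

$$\beta P = \sum_{\ell=1}^{\infty} \left(\frac{e^{\beta \mu}}{\lambda^3}\right)^\ell \frac{\bar{b}_\ell}{\ell!} . \tag{V.27}$$

The linked cluster theorem ensures $\mathcal{G} \propto V$, since if any non-linked cluster had appeared in
$\ln \mathcal{Q}$, it would have contributed a higher power of V.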

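Although an expansion for the gas pressure, eq.(V.27) is quite different from eq.(V.14)
in that it involves powers of $e^{\beta \mu}$ rather than the density n = N/V. This difference can be
removed by solving for the density in terms of the chemical potential, using

$$N = \frac{\partial \ln \mathcal{Q}}{\partial (\beta \mu)} = \sum_{\ell=1}^{\infty} \ell \left(\frac{e^{\beta \mu}}{\lambda^3}\right)^\ell \frac{V \bar{b}_\ell}{\ell!} . \tag{V.28}$$

The equation of state can be obtained by eliminating the fugacity $x = e^{\beta \mu} / \lambda^3$ between
the equations

$$n = \sum_{\ell=1}^{\infty} \frac{x^\ell}{(\ell - 1)!} \bar{b}_\ell , \quad \text{and} \quad \beta P = \sum_{\ell=1}^{\infty} \frac{x^\ell}{\ell!} \bar{b}_\ell , \tag{V.29}$$

using the following steps:
(a) Solve for x(n) from ($\bar{b}_1 = \int d^3\vec{q} / V = 1$)

$$x = n - \bar{b}_2 x^2 - \frac{\bar{b}_3}{2} x^3 - \cdots . \tag{V.30}$$

The perturbative solution at each order is obtained by substituting the solution at the
previous order in eq.(V.30),

$$x_1 = n + \mathcal{O}(n^2) ,$$
$$x_2 = n - \bar{b}_2 n^2 + \mathcal{O}(n^3) ,$$
$$x_3 = n - \bar{b}_2 \left(n - \bar{b}_2 n^2\right)^2 - \frac{\bar{b}_3}{2} n^3 + \mathcal{O}(n^4)
= n - \bar{b}_2 n^2 + \left(2 \bar{b}_2^2 - \frac{\bar{b}_3}{2}\right) n^3 + \mathcal{O}(n^4) . \tag{V.31}$$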




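(b) Substitute the perturbative result for x(n) into eq.(V.29), yielding

$$\beta P = x + \frac{\bar{b}_2}{2} x^2 + \frac{\bar{b}_3}{6} x^3 + \cdots
= n - \bar{b}_2 n^2 + \left(2 \bar{b}_2^2 - \frac{\bar{b}_3}{2}\right) n^3 + \frac{\bar{b}_2}{2} n^2 - \bar{b}_2^2 n^3 + \frac{\bar{b}_3}{6} n^3 + \cdots
= n - \frac{\bar{b}_2}{2} n^2 + \left(\bar{b}_2^2 - \frac{\bar{b}_3}{3}\right) n^3 + \mathcal{O}(n^4) . \tag{V.32}$$

The final result is in the form of the virial expansion of eq.(V.14),

$$\beta P = n + \sum_{\ell=2}^{\infty} B_\ell(T) \, n^\ell .$$

The first term in the series reproduces the ideal gas result. The next two corrections are

$$B_2 = -\frac{\bar{b}_2}{2} = -\frac{1}{2} \int d^3\vec{q} \left( e^{-\beta \mathcal{V}(\vec{q})} - 1 \right) , \tag{V.33}$$

and

$$B_3 = \bar{b}_2^2 - \frac{\bar{b}_3}{3}
= \left( \int d^3\vec{q} \left( e^{-\beta \mathcal{V}(\vec{q})} - 1 \right) \right)^2
- \frac{1}{3} \left[ 3 \int d^3\vec{q}_{12} \, d^3\vec{q}_{13} \, f(\vec{q}_{12}) f(\vec{q}_{13}) + \int d^3\vec{q}_{12} \, d^3\vec{q}_{13} \, f(\vec{q}_{12}) f(\vec{q}_{13}) f(\vec{q}_{12} - \vec{q}_{13}) \right]
= -\frac{1}{3} \int d^3\vec{q}_{12} \, d^3\vec{q}_{13} \, f(\vec{q}_{12}) f(\vec{q}_{13}) f(\vec{q}_{12} - \vec{q}_{13}) . \tag{V.34}$$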
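
Steps (a) and (b) amount to inverting one power series and substituting it into another, which can be checked mechanically. The sketch below does this with truncated polynomials in n and exact fractions, iterating eq.(V.30) and inserting the result into the pressure series of eq.(V.29). The helper names and the sample values of the reduced cluster integrals are assumptions made for the example.

```python
from fractions import Fraction

ORDER = 3  # keep terms through n^3

def mul(p, q):
    """Product of two polynomials in n (coefficient lists), truncated at ORDER."""
    out = [Fraction(0)] * (ORDER + 1)
    for i, a in enumerate(p):
        for j, c in enumerate(q):
            if i + j <= ORDER:
                out[i + j] += a * c
    return out

def virial_coefficients(b2, b3):
    """Return (B_2, B_3) obtained by eliminating the fugacity x order by order."""
    b2, b3 = Fraction(b2), Fraction(b3)
    # Iterate x = n - b2 x^2 - (b3/2) x^3, starting from x = n.
    x = [Fraction(0), Fraction(1), Fraction(0), Fraction(0)]
    for _ in range(ORDER):
        x2 = mul(x, x)
        x3 = mul(x2, x)
        x = [Fraction(int(k == 1)) - b2 * x2[k] - (b3 / 2) * x3[k]
             for k in range(ORDER + 1)]
    # beta P = x + (b2/2) x^2 + (b3/6) x^3, expressed as a series in n.
    x2 = mul(x, x)
    x3 = mul(x2, x)
    beta_P = [x[k] + (b2 / 2) * x2[k] + (b3 / 6) * x3[k] for k in range(ORDER + 1)]
    return beta_P[2], beta_P[3]

b2, b3 = Fraction(3, 7), Fraction(-5, 2)      # hypothetical values of the reduced b_2, b_3
print(virial_coefficients(b2, b3))
print(-b2 / 2, b2 ** 2 - b3 / 3)
```

The two printed pairs should coincide, matching the coefficients of n^2 and n^3 in eq.(V.32). Extending ORDER would require adding the higher cluster integrals to both series.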


The above example demonstrates the cancellation of the one particle reducible cluster
that appears in $\bar{b}_3$. While all clusters (reducible or not) appear in the sum for $b_\ell$, as
demonstrated in the previous section, only the one particle irreducible ones can appear in
an expansion in powers of density. The final expression for the $\ell$-th virial coefficient is

$$B_\ell(T) = -\frac{(\ell - 1)}{\ell!} \bar{d}_\ell , \tag{V.35}$$

where $\bar{d}_\ell$ is defined as the sum over all one-particle-irreducible clusters of $\ell$ points. Note
that in terms of $\bar{d}_\ell$, the partition function can be organized as

$$\ln Z = \ln Z_0 + V \sum_{\ell=2}^{\infty} \frac{n^\ell}{\ell!} \bar{d}_\ell , \tag{V.36}$$

reproducing the above virial expansion from $\beta P = \partial \ln Z / \partial V$.




V.C The Second Virial Coefficient & van der Waals Equation


Let us study the second virial coefficient $B_2$ for a typical gas using eq.(V.33). As
discussed before, the two-body potential is characterized by a hard core repulsion at short
distances and a van der Waals attraction at large distances. To make the computations
easier, we shall use the following approximation for the potential,

$$\mathcal{V}(r) = \begin{cases} +\infty & \text{for } r < r_0 \\ -u_0 (r_0/r)^6 & \text{for } r > r_0 \end{cases} \tag{V.37}$$

which combines both features. The contributions of the two portions can then be calculated
separately as,

$$\bar{b}_2 = \int d^3\vec{r} \left( e^{-\beta \mathcal{V}(r)} - 1 \right)
= \int_0^{r_0} 4 \pi r^2 dr \, (-1) + \int_{r_0}^{\infty} 4 \pi r^2 dr \left[ e^{+\beta u_0 (r_0/r)^6} - 1 \right] . \tag{V.38}$$

The second integrand can be approximated by $\beta u_0 (r_0/r)^6$ in the high temperature limit,
$\beta u_0 \ll 1$, and leads to

$$\bar{b}_2 = -\frac{4 \pi r_0^3}{3} + 4 \pi \beta u_0 r_0^6 \left. \left( -\frac{r^{-3}}{3} \right) \right|_{r_0}^{\infty}
= -\frac{4 \pi r_0^3}{3} \left(1 - \beta u_0\right) . \tag{V.39}$$

We can define an excluded volume of $\Omega = 4 \pi r_0^3 / 3$, which is 8 times the atomic volume (since
the distance of minimum approach $r_0$ is twice an atomic radius), to get

$$B_2(T) = \frac{\Omega}{2} \left( 1 - \frac{u_0}{k_B T} \right) . \tag{V.40}$$
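
The quality of the high temperature approximation in eq.(V.40) can be gauged by evaluating the integral of eq.(V.38) numerically. The sketch below substitutes s = r_0/r to map the attractive tail onto a finite interval and applies Simpson's rule. The function names, the grid size, and the unit choice r_0 = 1 are assumptions for illustration.

```python
import math

def B2_numeric(beta_u0, r0=1.0, n=2000):
    """B_2 = -(1/2) * integral d^3r (exp(-beta V) - 1) for the potential of eq.(V.37)."""
    def integrand(s):
        # With s = r0 / r, the tail r in [r0, inf) maps onto s in (0, 1] and
        # r^2 dr (e^{beta u0 (r0/r)^6} - 1) becomes r0^3 (e^{beta u0 s^6} - 1) / s^4 ds.
        if s == 0.0:
            return 0.0            # the integrand vanishes as beta_u0 * s^2
        return math.expm1(beta_u0 * s ** 6) / s ** 4
    h = 1.0 / n
    total = integrand(0.0) + integrand(1.0)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * integrand(k * h)
    tail = 4 * math.pi * r0 ** 3 * total * h / 3      # attractive part of b_2
    core = -4 * math.pi * r0 ** 3 / 3                 # hard-core part of b_2
    return -0.5 * (core + tail)

def B2_high_T(beta_u0, r0=1.0):
    """Eq.(V.40): (Omega / 2) (1 - beta u0) with Omega = 4 pi r0^3 / 3."""
    omega = 4 * math.pi * r0 ** 3 / 3
    return 0.5 * omega * (1 - beta_u0)

for beta_u0 in (0.01, 0.1, 1.0):
    print(beta_u0, B2_numeric(beta_u0), B2_high_T(beta_u0))
```

At small beta*u_0 the two columns should nearly coincide; as beta*u_0 approaches 1 the exponential contributes more than its linear term, so the numerical B_2 falls below the linearized value.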


e Remarks and observations: 

(1) The tail of the van der Waals attractive potential («x r~°) extends to very long sep- 
arations. Yet, its integral in eq.(V.39) is dominated by contributions from the short 
scales ro. In this limited context, the van der Waals potential is short-ranged, and 
results in corrections to the ideal gas behavior that are analytical in density n, leading 
to the virial series. 

(2) By contrast, potentials that fall off with separation as $1/r^3$ or slower are long-ranged.
The integral appearing in the calculation of the second virial coefficient is dominated by
long distances, and is divergent. As a result, corrections to the ideal gas behavior cannot
be written in the form of a virial series, and are in fact non-analytic. A good
example is provided by the Coulomb interactions (see problems for the test), where
the non-analytic corrections can be obtained by summing all the ring diagrams in the
cumulant (or cluster) expansions.

(3) The second virial coefficient has dimensions of volume, and (for short-range potentials)
is proportional to the atomic volume $\Omega$. In the high temperature limit, the importance
of corrections to ideal gas behavior can be estimated by comparing the first two terms
of eq.(V.14),

$$\frac{B_2 n^2}{n} = \frac{B_2}{n^{-1}} \sim \frac{\text{atomic volume}}{\text{volume per particle in gas}} \sim \frac{\text{gas density}}{\text{liquid density}}. \tag{V.41}$$

This ratio is roughly $10^{-3}$ for air at room temperature and pressure. The corrections
to ideal gas behavior are thus small at low densities. On dimensional grounds, a similar
ratio is expected for the higher order terms, $B_\ell n^\ell / B_{\ell-1} n^{\ell-1}$, in the virial series. We
may thus suspect the convergence of the series at high enough densities (when the gas
liquifies).

(4) The virial expansion breaks down not only at high densities, but also at low temperatures.
This is suggested by the divergences in eqs.(V.40) and (V.38) as $T \to 0$, and
reflects the fact that in the presence of attractive interactions the particles can lower
their energy at low temperatures by condensing into a liquid state.

(5) The truncated virial expansion,

$$\frac{P}{k_BT} = n + \frac{\Omega}{2}\left(1 - \frac{u_0}{k_BT}\right) n^2 + \cdots, \tag{V.42}$$

can be rearranged as

$$\frac{1}{k_BT}\left(P + \frac{u_0\Omega}{2}\, n^2\right) = n\left(1 + n\frac{\Omega}{2} + \cdots\right) \approx \frac{n}{1 - n\Omega/2} = \frac{N}{V - N\Omega/2}. \tag{V.43}$$

This is precisely in the form of the van der Waals equation

$$\left[P + \frac{u_0\Omega}{2}\left(\frac{N}{V}\right)^2\right]\left[V - \frac{N\Omega}{2}\right] = N k_B T, \tag{V.44}$$

and we can identify the van der Waals parameters, $a = u_0\Omega/2$ and $b = \Omega/2$.
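As a rough numerical illustration of remarks (3) and (5), the following sketch estimates the size of the first virial correction for a gas at room conditions, and compares the truncated virial pressure with the van der Waals form. The hard-core size `r0 = 3e-10` m and the well depth `u0 = 1e-21` J are assumed, order-of-magnitude values chosen only for illustration; they are not taken from the text.

```python
import math

K_B = 1.380649e-23  # Boltzmann constant in J/K


def atomic_volume(r0):
    """Omega = 4 pi r0^3 / 3 for a hard core of size r0."""
    return 4.0 * math.pi * r0**3 / 3.0


def second_virial(omega, u0, temperature):
    """High temperature form B2 = (Omega / 2) (1 - u0 / k_B T)."""
    return 0.5 * omega * (1.0 - u0 / (K_B * temperature))


def pressure_truncated_virial(n, omega, u0, temperature):
    """P = k_B T (n + B2 n^2), the virial series cut after two terms."""
    b2 = second_virial(omega, u0, temperature)
    return K_B * temperature * (n + b2 * n**2)


def pressure_van_der_waals(n, omega, u0, temperature):
    """P = n k_B T / (1 - n b) - a n^2 with a = u0 Omega / 2, b = Omega / 2."""
    a = 0.5 * u0 * omega
    b = 0.5 * omega
    return n * K_B * temperature / (1.0 - n * b) - a * n**2


if __name__ == "__main__":
    omega = atomic_volume(3.0e-10)      # assumed hard-core size
    u0 = 1.0e-21                        # assumed well depth
    T = 300.0
    n = 101325.0 / (K_B * T)            # ideal gas density at 1 atm
    print("B2 * n =", second_virial(omega, u0, T) * n)
    print("virial P =", pressure_truncated_virial(n, omega, u0, T))
    print("vdW    P =", pressure_van_der_waals(n, omega, u0, T))
```

At such low densities the two pressures differ only at order $(n\Omega/2)^2$, which is why the rearrangement into the van der Waals form is harmless there.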


Physical interpretation of the van der Waals equation: Historically, van der Waals
suggested eq.(V.44) on the basis of experimental results for the equation of state of various
gases, towards the end of the 19th century. At that time the microscopic interactions
between gas particles were not known, and van der Waals postulated the necessity of an
attractive interaction between gas atoms based on the observed decreases in pressure. It
was only later on that such interactions were observed directly, and then attributed to the
induced dipole-dipole forces by London. The physical justification of the correction terms
is as follows.

(a) There is a correction to the gas volume $V$ due to the hard core exclusions. At first sight,
it may appear surprising that the excluded volume $b$ in eq.(V.44) is one half of the volume
that is excluded around each particle. This is because this factor measures a joint excluded
volume involving all particles in phase space. In fact, the contribution of coordinates to
the partition function of the hard-core gas can be estimated at low densities, from

$$S_N = \frac{1}{N!}\int' \prod_{i=1}^{N} d^3\vec q_i \approx \frac{1}{N!}\, V(V-\Omega)(V-2\Omega)\cdots\left(V-(N-1)\Omega\right) \approx \frac{1}{N!}\left(V - \frac{N\Omega}{2}\right)^N. \tag{V.45}$$

The above result is obtained by adding particles one at a time, and noting that the available
volume for the $m$-th particle is $(V - m\Omega)$. At low densities, the overall effect is a reduction
of the volume available to each particle by approximately $\Omega/2$. Of course, the above result
is only approximate, since the effects of excluded volume involving more than two particles
are not correctly taken into account. The relatively simple form of eq.(V.45) is only exact
for spatial dimensions $d = 1$ and infinity. As proved in problems for the test, the exact
excluded volume in $d = 1$ is in fact $\Omega$.
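The replacement of the product of available volumes by $(V - N\Omega/2)^N$ can be checked numerically. The sketch below compares the logarithms of the two expressions at low density; the values of `N`, `V`, and `omega` are arbitrary illustrative choices, and the overall $1/N!$ is omitted since it is common to both sides.

```python
import math


def log_sequential_product(num_particles, volume, omega):
    """ln of V (V - Omega) ... (V - (N-1) Omega), adding particles one at a time."""
    return sum(math.log(volume - m * omega) for m in range(num_particles))


def log_uniform_reduction(num_particles, volume, omega):
    """ln of (V - N Omega / 2)^N, each particle losing Omega / 2 per other particle."""
    return num_particles * math.log(volume - 0.5 * num_particles * omega)


if __name__ == "__main__":
    N, V, omega = 1000, 1.0e6, 1.0   # dilute: N * omega / V = 1e-3
    exact = log_sequential_product(N, V, omega)
    approx = log_uniform_reduction(N, V, omega)
    ideal = N * math.log(V)
    print("excluded volume effect on ln S:", exact - ideal)
    print("error of the Omega/2 approximation:", exact - approx)
```

The error of the approximation is much smaller than the excluded volume effect itself, as long as $N\Omega \ll V$.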
(b) The decrease in pressure $P$, due to attractive interactions, is somewhat harder to
quantify. In sec.III.F, the gas pressure was related to the impacts of particles on a wall via

$$P = (n v_x)(2 m v_x)\Big|_{v_x<0} = n m \overline{v_x^2}, \tag{V.46}$$

where the first term is the number of collisions per unit time and area, while the second is
the momentum imparted by each particle. For the ideal gas, the usual equation of state
is recovered by noting that the average kinetic energy is $m\overline{v_x^2}/2 = k_BT/2$. Attractive
interactions lead to a reduction in pressure given by

$$\delta P = \delta n \left(m\overline{v_x^2}\right) + n\,\delta\left(m\overline{v_x^2}\right). \tag{V.47}$$

While different statistical ensembles give the same pressure, which is a bulk state function,
they may lead to different behaviors at the surface. We must thus be careful, and consistent,
in the evaluation of eq.(V.47), which depends on surface properties.


In a canonical ensemble, the gas density is reduced at the walls. This is because the
particles in the middle of the box experience an attractive potential $V$ (an energy, not the
volume) from all sides, while at the edge only an attractive energy of $V/2$ is available from
half of the space. The resulting change in density is approximately

$$\delta n \approx n\left(e^{\beta V/2} - 1\right) \approx \beta n V/2. \tag{V.48}$$

Integrating the interaction of one particle in the bulk with the rest gives

$$V = \int d^3\vec r\; \mathcal{V}_{\rm attr.}(r)\, n = -n\Omega u_0. \tag{V.49}$$

The change in density thus gives the pressure correction of $\delta P = -n^2\Omega u_0/2$ calculated
in eq.(V.44). There is no correction to the kinetic energy of the particles impinging on
the wall, since in the canonical formulation the probabilities for momentum and location
of the particles are independent variables. The probability distribution for momentum is
uniform in space, giving the average kinetic energy of $k_BT/2$ for each direction of motion.

A different explanation is presented in a kinetic formulation in which particles follow
the deterministic Hamiltonian equations of motion. In this formulation, the impinging
particles lose kinetic energy in approaching the wall from the surface, since they have to
climb out of the potential well set up by the attractions of bulk particles. The reduction
in kinetic energy is given by

$$\delta\left(\frac{m\overline{v_x^2}}{2}\right) = \frac{1}{2}\int d^3\vec r\; \mathcal{V}_{\rm attr.}(r)\, n = -\frac{1}{2}\, n\Omega u_0. \tag{V.50}$$

The reduced velocities lead to an increase in the surface density in this case, as the slower
particles spend a longer time $\tau$ in the vicinity of the wall! The relative change in density
is given by

$$\frac{\delta n}{n} = \frac{\delta\tau}{\tau} = -\frac{\delta v_x}{v_x} = -\frac{1}{2}\frac{\delta\overline{v_x^2}}{\overline{v_x^2}} = -\beta V/2. \tag{V.51}$$

The increase in density is precisely the opposite of the result of eq.(V.48) in the canonical
formulation. However, with the decrease in kinetic energy calculated in eq.(V.50), it again
leads to the correct reduction in pressure.
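The bookkeeping of the two formulations can be made explicit by inserting their respective changes of density and kinetic energy into eq.(V.47). The sketch below does this to leading order in the attraction, with $m\overline{v_x^2} = k_BT$ for the unperturbed gas; the function names and the dimensionless test values are illustrative assumptions.

```python
import math


def pressure_correction_canonical(n, omega, u0, kT):
    """Canonical picture: density reduced at the wall, kinetic energy unchanged."""
    v_attr = -n * omega * u0                # eq.(V.49), mean attractive energy
    delta_n = n * v_attr / (2.0 * kT)       # eq.(V.48), linearized
    delta_mv2 = 0.0                         # momenta are independent of position
    return delta_n * kT + n * delta_mv2     # eq.(V.47) with m <v_x^2> = kT


def pressure_correction_kinetic(n, omega, u0, kT):
    """Kinetic picture: density enhanced at the wall, kinetic energy reduced."""
    v_attr = -n * omega * u0
    delta_mv2 = v_attr                      # twice eq.(V.50)
    delta_n = -n * v_attr / (2.0 * kT)      # eq.(V.51)
    return delta_n * kT + n * delta_mv2     # eq.(V.47) with m <v_x^2> = kT


if __name__ == "__main__":
    n, omega, u0, kT = 0.01, 1.0, 1.0, 2.0  # arbitrary dimensionless values
    print(pressure_correction_canonical(n, omega, u0, kT))
    print(pressure_correction_kinetic(n, omega, u0, kT))
    print(-0.5 * n**2 * omega * u0)
```

Both routes give $\delta P = -n^2\Omega u_0/2$, although the individual contributions to eq.(V.47) have opposite signs in the two pictures.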




V.D Breakdown of the van der Waals equation 


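As discussed in sec.I.I, mechanical stability of a gas requires the positivity of the
isothermal compressibility, $\kappa_T = -V^{-1}\,\partial V/\partial P|_T$. This condition can be obtained by
examining density fluctuations in a grand canonical ensemble. The probability of finding
$N$ particles in a volume $V$ is given by eq.(IV.102) as

$$p(N,V) = \frac{e^{\beta\mu N} Z(T,N,V)}{\mathcal{Q}}. \tag{V.52}$$

Since for a gas $\ln\mathcal{Q} = -\beta\mathcal{G} = PV/k_BT$, eqs.(IV.103) and (IV.104) simplify to

$$\begin{cases} \langle N\rangle_c = \dfrac{\partial(\ln\mathcal{Q})}{\partial(\beta\mu)} = V\left.\dfrac{\partial P}{\partial\mu}\right|_{T,V}, \\[2ex] \langle N^2\rangle_c = \dfrac{\partial^2(\ln\mathcal{Q})}{\partial(\beta\mu)^2} = \dfrac{\partial\langle N\rangle_c}{\partial(\beta\mu)} = k_BT\left.\dfrac{\partial N}{\partial\mu}\right|_{T,V}. \end{cases} \tag{V.53}$$

Dividing the two equations, and using the chain rule, results in

$$\frac{\langle N^2\rangle_c}{N} = \frac{k_BT}{V}\left.\frac{\partial N}{\partial P}\right|_{T,V} = -\frac{k_BT}{V}\left.\frac{\partial N}{\partial V}\right|_{P,T}\left.\frac{\partial V}{\partial P}\right|_{N,T} = n k_B T\,\kappa_T. \tag{V.54}$$

The positivity of $\kappa_T$ is thus tied to that of the variance of $N$. A stable value of $N$
corresponds to a maximum of the probability $p(N,V)$, i.e. a positive compressibility. A
negative $\kappa_T$ actually corresponds to a minimum in $p(N,V)$, implying that the system is least
likely to be found at such densities. Fluctuations in density will then occur spontaneously
and change the density to a stable value.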
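To make eq.(V.54) concrete, the sketch below evaluates $\langle N^2\rangle_c/N = n k_BT\kappa_T$ for the van der Waals equation of state, using $\kappa_T^{-1} = n\,\partial P/\partial n|_T$. The parameter values are illustrative assumptions in units where $a = 1$ and $b = 1/3$, so that the critical density is $n_c = 1$ and $k_BT_c = 8/9$.

```python
def vdw_pressure(n, kT, a, b):
    """van der Waals pressure as a function of density n = N / V."""
    return n * kT / (1.0 - n * b) - a * n * n


def vdw_inverse_compressibility(n, kT, a, b):
    """1 / kappa_T = n dP/dn at fixed T, differentiated analytically."""
    return n * (kT / (1.0 - n * b) ** 2 - 2.0 * a * n)


def relative_number_variance(n, kT, a, b):
    """<N^2>_c / N = n kT kappa_T from eq.(V.54)."""
    return n * kT / vdw_inverse_compressibility(n, kT, a, b)


if __name__ == "__main__":
    a, b = 1.0, 1.0 / 3.0
    print("ideal gas:", relative_number_variance(0.2, 1.0, 0.0, 0.0))
    print("above T_c:", relative_number_variance(1.0, 1.0, a, b))
    print("below T_c:", relative_number_variance(1.0, 0.8, a, b))
```

For the ideal gas the ratio is one (Poisson statistics), while a negative value below $T_c$ signals that the assumed uniform state cannot be a stable one.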

Any approximate equation of state, such as the van der Waals equation, must at
least satisfy the stability requirements. However, the van der Waals isotherms contain a
portion with $-\partial P/\partial V|_T < 0$, for temperatures less than a critical value $T_c$. The negative
compressibility implies an instability towards forming domains of lower and higher density,
i.e. phase separation. The attractive interactions in real gases do indeed result in a liquid-gas
phase separation at low temperatures. The isotherms then include a flat portion,
$\partial P/\partial V|_T = 0$, at the coexistence of the two phases. Can the (unstable) van der Waals
isotherms be used to construct the phase diagram of a real gas?

One way of doing so is by the following Maxwell construction: The variations of the
chemical potential $\mu(T,P)$ along an isotherm are obtained by integrating eq.(V.53), as

$$d\mu = \frac{V}{N}\,dP, \quad\Longrightarrow\quad \mu(T,P) = \mu(T,P_A) + \int_{P_A}^{P} dP'\,\frac{V(T,P')}{N}. \tag{V.55}$$

Since the van der Waals isotherms for $T < T_c$ are non-monotonic, there is a range of
pressures that correspond to three different values, $\{\mu_\alpha\}$, of the chemical potential. The
possibility of several values of $\mu$ at a given temperature and pressure indicates phase
coexistence. In equilibrium, the number of particles in each phase, $N_\alpha$, adjusts so as
to minimize the Gibbs free energy $G = \sum_\alpha \mu_\alpha N_\alpha$. Clearly, the phase with lowest $\mu_\alpha$
will acquire all the particles. A phase transition occurs when two branches of allowed
chemical potentials intersect. From eq.(V.55), the critical pressure $P_c$ for this intersection
is obtained from the condition

$$\oint_{P_c}^{P_c} dP'\, V(T,P') = 0. \tag{V.56}$$

A geometrical interpretation of the above result is that $P_c$ corresponds to a pressure that
encloses equal areas of the non-monotonic isotherm on each side. The Maxwell construction
approach to phase condensation is somewhat unsatisfactory, as it relies on integrating a
clearly unphysical portion of the van der Waals isotherm. A better approach that makes the
approximations involved more apparent is presented in the next section.
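The equal-area condition can be carried out numerically. The sketch below is one possible implementation, not a prescribed algorithm: it assumes the van der Waals isotherm written in units of its critical point, $P = 8T/(3v-1) - 3/v^2$ with $v$ the volume per particle, and uses the equivalent form $\int_{v_{\rm liq}}^{v_{\rm gas}} \left[P(v) - P_{\rm coex}\right] dv = 0$ of eq.(V.56), solved by nested bisections. All function names are hypothetical.

```python
import math


def reduced_pressure(v, t):
    """van der Waals isotherm in units of the critical pressure, volume, temperature."""
    return 8.0 * t / (3.0 * v - 1.0) - 3.0 / v**2


def bisect(f, lo, hi, steps=200):
    """Root of f between lo and hi, assuming f changes sign on the interval."""
    f_lo = f(lo)
    for _ in range(steps):
        mid = 0.5 * (lo + hi)
        f_mid = f(mid)
        if (f_mid > 0) == (f_lo > 0):
            lo, f_lo = mid, f_mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def spinodal_volumes(t):
    """Volumes where dP/dv = 0, i.e. 4 t v^3 = (3 v - 1)^2, valid for t < 1."""
    g = lambda v: 4.0 * t * v**3 - (3.0 * v - 1.0) ** 2
    return bisect(g, 1.0 / 3.0 + 1e-9, 1.0), bisect(g, 1.0, 1e3)


def coexisting_volumes(p, t, v_min, v_max):
    """Liquid and gas roots of P(v) = p on the two stable branches."""
    h = lambda v: reduced_pressure(v, t) - p
    return bisect(h, 1.0 / 3.0 + 1e-9, v_min), bisect(h, v_max, 1e4)


def area_difference(p, t, v_min, v_max):
    """Integral of P(v) - p between the liquid and gas volumes."""
    v_liq, v_gas = coexisting_volumes(p, t, v_min, v_max)
    primitive = lambda v: (8.0 * t / 3.0) * math.log(3.0 * v - 1.0) + 3.0 / v
    return primitive(v_gas) - primitive(v_liq) - p * (v_gas - v_liq)


def maxwell_construction(t):
    """Coexistence pressure and volumes (p, v_liq, v_gas) for t < 1."""
    v_min, v_max = spinodal_volumes(t)
    p_lo = max(reduced_pressure(v_min, t), 1e-3)
    p_hi = reduced_pressure(v_max, t)
    p = bisect(lambda q: area_difference(q, t, v_min, v_max), p_lo, p_hi)
    return (p,) + coexisting_volumes(p, t, v_min, v_max)


if __name__ == "__main__":
    print(maxwell_construction(0.9))
```

The search for the pressure is restricted to the window between the two spinodal pressures, since only there does the isotherm have three intersections with a horizontal line.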


V.E Mean Field Theory of Condensation 


In principle, all properties of the interacting system, including phase separation,
are contained within the thermodynamic potentials that can be obtained by evaluating
$Z(T,N)$ or $\mathcal{Q}(T,\mu)$. Phase transitions, however, are characterized by discontinuities in
various state functions and must correspond to the appearance of singularities in the partition
functions. At first glance, it is somewhat surprising that any singular behavior
should emerge from computing such well behaved integrals (for short-ranged interactions)
as

$$Z(T,N,V) = \int \frac{\prod_{i=1}^{N} d^3\vec p_i\, d^3\vec q_i}{N!\,h^{3N}} \exp\left[-\beta\sum_{i=1}^{N}\frac{p_i^2}{2m} - \beta\sum_{i<j}\mathcal{V}(\vec q_i - \vec q_j)\right]. \tag{V.57}$$

Instead of evaluating the integrals perturbatively, we shall now set up a reasonable approximation
scheme. The contributions of the hard core and attractive portions of the
potential are again treated separately, and the partition function approximated by

$$Z(T,N,V) \approx \frac{1}{N!}\frac{1}{\lambda^{3N}}\,\underbrace{V(V-\Omega)\cdots\left(V-(N-1)\Omega\right)}_{\text{Excluded volume effects}}\,\exp(-\beta\bar U). \tag{V.58}$$


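Here $\bar U$ represents an average attraction energy, obtained by assuming a uniform density
$n = N/V$, as

$$\bar U = \frac{1}{2}\sum_{i,j}\mathcal{V}_{\rm attr.}(\vec q_i - \vec q_j) = \frac{1}{2}\int d^3\vec r_1\, d^3\vec r_2\; n(\vec r_1)\, n(\vec r_2)\,\mathcal{V}_{\rm attr.}(\vec r_1 - \vec r_2) \approx \frac{n^2}{2}\, V\int d^3\vec r\;\mathcal{V}_{\rm attr.}(r) \equiv -\frac{N^2}{2V}\, u. \tag{V.59}$$

The parameter $u$ describes the net effect of the attractive interactions. Substituting into
eq.(V.58) leads to the following approximation for the partition function

$$Z(T,N,V) \approx \frac{(V - N\Omega/2)^N}{N!\,\lambda^{3N}}\exp\left[\frac{\beta u N^2}{2V}\right]. \tag{V.60}$$

From the resulting free energy,

$$F = -k_BT\ln Z = -Nk_BT\ln(V - N\Omega/2) + Nk_BT\ln(N/e) + 3Nk_BT\ln\lambda - \frac{uN^2}{2V}, \tag{V.61}$$

we obtain the expression for the pressure in the canonical ensemble as

$$P_{\rm can} = -\left.\frac{\partial F}{\partial V}\right|_{T,N} = \frac{Nk_BT}{V - N\Omega/2} - \frac{uN^2}{2V^2}. \tag{V.62}$$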
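The step from the mean-field free energy to the canonical pressure can be verified by differentiating numerically. The following sketch works in units with $k_B = 1$ and sets the thermal wavelength `lam` to one by default; these choices, the function names, and the sample values are illustrative assumptions.

```python
import math


def free_energy(V, N, kT, omega, u, lam=1.0):
    """Mean-field F = -kT ln Z with Stirling's approximation for ln N!."""
    return (-N * kT * math.log(V - 0.5 * N * omega)
            + N * kT * math.log(N / math.e)
            + 3.0 * N * kT * math.log(lam)
            - u * N**2 / (2.0 * V))


def pressure_canonical(V, N, kT, omega, u):
    """Closed form of the canonical pressure, of van der Waals type."""
    return N * kT / (V - 0.5 * N * omega) - u * N**2 / (2.0 * V**2)


def pressure_from_free_energy(V, N, kT, omega, u, dV=1e-3):
    """P = -dF/dV at fixed T and N, by a central finite difference."""
    f_plus = free_energy(V + dV, N, kT, omega, u)
    f_minus = free_energy(V - dV, N, kT, omega, u)
    return -(f_plus - f_minus) / (2.0 * dV)


if __name__ == "__main__":
    args = (1000.0, 100.0, 1.0, 1.0, 2.0)   # V, N, kT, omega, u
    print(pressure_from_free_energy(*args), pressure_canonical(*args))
```

Comparing with eq.(V.44) identifies $a = u/2$ and $b = \Omega/2$, so that $u = \Omega u_0$ for the potential used earlier.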


Remarkably, the uniform density approximation reproduces the van der Waals equation
of state. However, the self-consistency of this approximation can now be checked.
As long as $\kappa_T$ is positive, eq.(V.54) implies that the variance of density vanishes for large
volumes as $\langle n^2\rangle_c = k_BT\, n^2\kappa_T/V$. But $\kappa_T$ diverges at $T_c$, and at lower temperatures its
negativity implies an instability towards density fluctuations as discussed in the previous
section. When condensation occurs, there is phase separation into high (liquid) and low
(gas) density states, and the uniform density assumption becomes manifestly incorrect.
This difficulty is circumvented in the grand canonical ensemble. Once the chemical potential
is fixed, the number of particles (and hence density) in this ensemble is automatically
adjusted to that of the appropriate phase.

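As the assumption of a uniform density is correct for both the liquid and gas phases,
we can use the approximations of eqs.(V.59) and (V.60) to estimate the grand partition
function

$$\mathcal{Q}(T,\mu,V) = \sum_{N=0}^{\infty} e^{\beta\mu N} Z(T,N,V) \approx \sum_{N=0}^{\infty}\exp\left[N\ln\left(\frac{V}{N} - \frac{\Omega}{2}\right) + \frac{\beta u N^2}{2V} + \Lambda N\right], \tag{V.63}$$

where $\Lambda = 1 + \beta\mu - \ln(\lambda^3)$. As in any sum over exponentials in $N$, the above expression
is dominated by a particular value of particle number (hence density), and given by

$$\mathcal{Q}(T,\mu,V) \approx \exp\left\{\max\left[\Lambda N + N\ln\left(\frac{V}{N} - \frac{\Omega}{2}\right) + \frac{\beta u N^2}{2V}\right]_N\right\}. \tag{V.64}$$

Hence, the grand canonical expression for the gas pressure is obtained from

$$\beta P_{\rm g.c.} = \frac{\ln\mathcal{Q}}{V} = \max\left[\Psi(n)\right]_n, \tag{V.65}$$

where

$$\Psi(n) = n\Lambda + n\ln\left(n^{-1} - \frac{\Omega}{2}\right) + \frac{\beta u}{2}\, n^2. \tag{V.66}$$

The possible values of density are obtained from $d\Psi/dn|_{n_\alpha} = 0$, and satisfy

$$\Lambda = -\ln\left(n_\alpha^{-1} - \frac{\Omega}{2}\right) + \frac{1}{1 - n_\alpha\Omega/2} - \beta u\, n_\alpha. \tag{V.67}$$

The above equation in fact admits multiple solutions $n_\alpha$ for the density. Substituting the
resulting $\Lambda$ into eq.(V.65) leads after some manipulation to

$$\beta P_{\rm g.c.} = \max\left[\frac{n_\alpha}{1 - n_\alpha\Omega/2} - \frac{\beta u}{2}\, n_\alpha^2\right]_\alpha = \max\left[\beta P_{\rm can}(n_\alpha)\right]_\alpha, \tag{V.68}$$

i.e. the grand canonical and canonical values of pressure are identical at a particular
density. However, if eq.(V.67) admits multiple solutions for the density at a particular
chemical potential, the correct density is uniquely determined as the one that maximizes
the canonical expression for pressure (or for $\Psi(n)$).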
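The selection of the density by maximizing $\Psi(n)$ can be illustrated with a brute-force search. In the sketch below the units are chosen so that $\Omega = 1$, and the values $\beta u = 5$ (below the critical temperature, where $\beta u = 27\Omega/8$) and $\Lambda = -2.3$ or $-1.5$ are assumed purely for illustration; a simple grid search stands in for a proper optimizer.

```python
import math


def psi(n, big_lambda, beta_u, omega=1.0):
    """Psi(n) of eq.(V.66); its maximum over n is beta times the pressure."""
    return (n * big_lambda
            + n * math.log(1.0 / n - 0.5 * omega)
            + 0.5 * beta_u * n * n)


def beta_pressure_canonical(n, beta_u, omega=1.0):
    """beta * P_can at density n, the right hand side of eq.(V.68)."""
    return n / (1.0 - 0.5 * n * omega) - 0.5 * beta_u * n * n


def maximize_psi(big_lambda, beta_u, omega=1.0, points=200000):
    """Density maximizing Psi on a uniform grid over 0 < n < 2 / omega."""
    n_max = 2.0 / omega
    grid = (n_max * (k + 0.5) / points for k in range(points))
    return max(grid, key=lambda n: psi(n, big_lambda, beta_u, omega))


if __name__ == "__main__":
    for big_lambda in (-2.3, -1.5):
        n_star = maximize_psi(big_lambda, 5.0)
        print(big_lambda, n_star,
              psi(n_star, big_lambda, 5.0),
              beta_pressure_canonical(n_star, 5.0))
```

For the lower value of $\Lambda$ the maximum sits at a low (gas) density, and for the higher value at a high (liquid) density, even though eq.(V.67) has several solutions in both cases; at the maximum the two expressions for the pressure agree to within the grid resolution.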

The mechanism for the liquid-gas phase transition is therefore the following. The
sum in eq.(V.63) is dominated by two large terms at the liquid and gas densities. At
a particular chemical potential, the stable phase is determined by the larger of the two
terms. The phase transition occurs when the dominant term changes upon varying the
temperature. In mathematical form

$$\ln\mathcal{Q} = \lim_{V\to\infty}\ln\left[e^{\beta V P_{\rm liquid}} + e^{\beta V P_{\rm gas}}\right] = \begin{cases}\beta V P_{\rm gas} & \text{for } T > T^*, \\ \beta V P_{\rm liquid} & \text{for } T < T^*.\end{cases} \tag{V.69}$$

The origin of the singularity in density can thus be traced to the thermodynamic limit of
$V \to \infty$. There are no phase transitions in finite systems!
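The role of the thermodynamic limit in eq.(V.69) can be seen by evaluating the two-term sum at finite volume. The sketch below computes $\ln\left[e^{V\beta P_{\rm liquid}} + e^{V\beta P_{\rm gas}}\right]/V$ in a numerically stable way; the two values of $\beta P$ are arbitrary illustrative numbers.

```python
import math


def log_q_per_volume(beta_p_liquid, beta_p_gas, volume):
    """ln[exp(V bP_liquid) + exp(V bP_gas)] / V, without overflowing."""
    larger = max(beta_p_liquid, beta_p_gas)
    smaller = min(beta_p_liquid, beta_p_gas)
    # Factor out the dominant exponential; the remainder is at most ln 2.
    return larger + math.log1p(math.exp(-volume * (larger - smaller))) / volume


if __name__ == "__main__":
    for volume in (1.0, 1e2, 1e4, 1e6):
        print(volume, log_q_per_volume(0.50, 0.52, volume))
```

For any finite $V$ the result is a smooth function of the two pressures; only as $V \to \infty$ does it reduce to the larger of the two, with a kink where they cross.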




V.F Variational Methods 


Perturbative methods provide a systematic way of incorporating the effect of interactions,
but are impractical for the study of strongly interacting systems. While the first
few terms in the virial series slightly modify the behavior of the gas, an infinite number
of terms have to be summed to obtain the condensation transition. An alternative, but
approximate, method for dealing with strongly interacting systems is the use of variational
methods.

Suppose that in an appropriate ensemble we need to calculate $Z = {\rm tr}\left(e^{-\beta\mathcal{H}}\right)$. In the
canonical formulation, $Z$ is the partition function corresponding to the Hamiltonian $\mathcal{H}$ at
temperature $k_BT = 1/\beta$, and for a classical system tr refers to the integral over the phase
space of $N$ particles. However, the method is more general and can be applied to Gibbs or
Grand partition functions with the appropriate modification of the exponential factor; also
in the context of quantum systems where tr is a sum over all allowed quantum microstates.
Let us assume that calculating $Z$ is very difficult for the (interacting) Hamiltonian $\mathcal{H}$, but
that there is another Hamiltonian $\mathcal{H}_0$ acting on the same set of degrees of freedom for which
the calculations are easier. We then introduce a Hamiltonian $\mathcal{H}(\lambda) = \mathcal{H}_0 + \lambda(\mathcal{H} - \mathcal{H}_0)$,
and a corresponding partition function

$$Z(\lambda) = {\rm tr}\left\{\exp\left[-\beta\mathcal{H}_0 - \lambda\beta(\mathcal{H} - \mathcal{H}_0)\right]\right\}, \tag{V.70}$$

which interpolates between the two as $\lambda$ changes from zero to one. It is then easy to prove
the convexity condition

$$\frac{d^2\ln Z(\lambda)}{d\lambda^2} = \beta^2\left\langle(\mathcal{H} - \mathcal{H}_0)^2\right\rangle_c \ge 0, \tag{V.71}$$

where $\langle\ \rangle$ is an expectation value with the appropriately normalized probability.
From the convexity of the function, it immediately follows that

$$\ln Z(\lambda) \ge \ln Z(0) + \lambda\left.\frac{d\ln Z}{d\lambda}\right|_{\lambda=0}. \tag{V.72}$$

But it is easy to see that $d\ln Z/d\lambda|_{\lambda=0} = \beta\langle\mathcal{H}_0 - \mathcal{H}\rangle^0$, where the superscript indicates
expectation values with respect to $\mathcal{H}_0$. Setting $\lambda = 1$, we obtain

$$\ln Z \ge \ln Z(0) + \beta\langle\mathcal{H}_0\rangle^0 - \beta\langle\mathcal{H}\rangle^0. \tag{V.73}$$
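The inequality of eq.(V.73) is easy to test on a toy problem. The sketch below assumes a classical system with a finite number of microstates, so that both Hamiltonians are simply lists of energies over the same states and tr is a plain sum; the random energies are illustrative and carry no physical meaning.

```python
import math
import random


def log_partition(energies, beta):
    """ln Z = ln sum_s exp(-beta E_s) for a finite set of microstates."""
    return math.log(sum(math.exp(-beta * e) for e in energies))


def gibbs_lower_bound(energies, trial_energies, beta):
    """Right hand side of eq.(V.73): ln Z(0) + beta <H_0 - H>^0."""
    weights = [math.exp(-beta * e0) for e0 in trial_energies]
    z0 = sum(weights)
    mean_h0_minus_h = sum(
        w * (e0 - e) for w, e0, e in zip(weights, trial_energies, energies)
    ) / z0
    return math.log(z0) + beta * mean_h0_minus_h


if __name__ == "__main__":
    random.seed(1)
    energies = [random.uniform(-2.0, 2.0) for _ in range(8)]
    trial = [random.uniform(-2.0, 2.0) for _ in range(8)]
    print("ln Z        =", log_partition(energies, 1.5))
    print("lower bound =", gibbs_lower_bound(energies, trial, 1.5))
```

The bound is saturated when the trial energies coincide with the true ones, which is why maximizing the right hand side over the parameters of $\mathcal{H}_0$ gives the best estimate.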




Eq.(V.73), known as the Gibbs inequality, is the basis for most variational estimates.
Typically, the 'simpler' Hamiltonian $\mathcal{H}_0$ (and hence the right hand side of eq.(V.73)) includes
several parameters $\{n_\alpha\}$. The best estimate for $\ln Z$ is obtained by finding the
maximum of the right hand side with respect to these parameters. It can now be checked
that the approximate evaluation of the grand partition function $\mathcal{Q}$ in the preceding section
is equivalent to a variational treatment with $\mathcal{H}_0$ corresponding to a gas of hard-core
particles of density $n$, for which (after replacing the sum by its dominant term)

$$\ln\mathcal{Q}_0 = \beta\mu N + \ln Z = V\left[n\left(1 + \beta\mu - \ln(\lambda^3)\right) + n\ln\left(\frac{1}{n} - \frac{\Omega}{2}\right)\right]. \tag{V.74}$$

The difference $\mathcal{H} - \mathcal{H}_0$ contains the attractive portion of the two body interactions. In the
regions of phase space not excluded by the hard core interactions the gas in $\mathcal{H}_0$ is described
by a uniform density $n$. Hence

$$\beta\langle\mathcal{H}_0 - \mathcal{H}\rangle^0 = \beta V\,\frac{n^2}{2}\, u, \tag{V.75}$$

and

$$\beta P = \frac{\ln\mathcal{Q}}{V} \ge n\left(1 + \beta\mu - \ln(\lambda^3)\right) + n\ln\left(\frac{1}{n} - \frac{\Omega}{2}\right) + \frac{1}{2}\beta u\, n^2, \tag{V.76}$$

which is the same as eq.(V.66). The density $n$ is now a parameter on the right hand side
of eq.(V.76). Obtaining the best variational estimate by maximizing with respect to $n$ is
then equivalent to eq.(V.65).


V.G Corresponding States


We now have a good perturbative understanding of the behavior of a dilute interacting
gas at high temperatures. At lower temperatures, attractive interactions lead to condensation
into the liquid state. The qualitative behavior of the phase diagram is the same for
most simple gases. There is a line of transitions in the coordinates $(P,T)$ (corresponding
to the coexistence of liquid and gas in the $(V,T)$ plane) that terminates at a so-called
critical point. It is thus possible to transform a liquid to a gas without encountering any
singularities. Since the ideal gas law is universal, i.e. independent of material, we may
hope that there is also a generalized universal equation of state (presumably more complicated)
that describes interacting gases, including liquid/gas condensation phenomena.
This hope motivated the search for a law of corresponding states, obtained by appropriate
rescalings of state functions. The most natural choice of scales for pressure, volume, and
temperature are those of the critical point, $(P_c, v_c, T_c)$.
The van der Waals equation is an example of a generalized equation of state. Its
critical point is found by setting $\partial P/\partial V|_T$ and $\partial^2 P/\partial V^2|_T$ to zero. The former is the
limit of the flat coexistence portion of liquid/gas isotherms; the latter follows from the
stability requirement $\kappa_T > 0$ (see the discussion after eq.(I.72)). The coordinates of the
critical point are thus obtained from solving the following coupled equations,

$$\begin{cases} P = \dfrac{k_BT}{v - b} - \dfrac{a}{v^2}, \\[2ex] \left.\dfrac{\partial P}{\partial v}\right|_T = -\dfrac{k_BT}{(v - b)^2} + \dfrac{2a}{v^3} = 0, \\[2ex] \left.\dfrac{\partial^2 P}{\partial v^2}\right|_T = \dfrac{2k_BT}{(v - b)^3} - \dfrac{6a}{v^4} = 0, \end{cases} \tag{V.77}$$

where $v = V/N$ is the volume per particle. The solution to these equations is

$$P_c = \frac{a}{27b^2}, \qquad v_c = 3b, \qquad k_BT_c = \frac{8a}{27b}. \tag{V.78}$$

Naturally, the critical point depends on the microscopic Hamiltonian (e.g. on the 2-body
interaction) via the parameters $a$ and $b$. However, we can scale out such dependencies
by measuring $P$, $T$, and $v$ in units of $P_c$, $T_c$ and $v_c$. Setting $P_r = P/P_c$, $v_r = v/v_c$, and
$T_r = T/T_c$, a reduced version of the van der Waals equation is obtained as

$$P_r = \frac{8}{3}\,\frac{T_r}{v_r - 1/3} - \frac{3}{v_r^2}. \tag{V.79}$$

We have thus constructed a universal (material independent) equation of state. Since the
original van der Waals equation depends only on two parameters, eqs.(V.78) predict
a universal dimensionless ratio,

$$\frac{P_c v_c}{k_BT_c} = \frac{3}{8} = 0.375. \tag{V.80}$$

Experimentally, this ratio is found to be in the range of 0.28 to 0.33. The van der Waals
equation is thus not a good candidate for the putative universal equation of state.
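The critical coordinates and the universal ratio can be checked directly by inserting them into the van der Waals isotherm and its first two volume derivatives. In the sketch below the values of `a` and `b` are arbitrary illustrative numbers, and $k_BT$ is treated as a single variable `kT`.

```python
import math


def vdw_pressure(v, kT, a, b):
    """van der Waals pressure in terms of the volume per particle v."""
    return kT / (v - b) - a / v**2


def dP_dv(v, kT, a, b):
    """First derivative of the isotherm with respect to v."""
    return -kT / (v - b) ** 2 + 2.0 * a / v**3


def d2P_dv2(v, kT, a, b):
    """Second derivative of the isotherm with respect to v."""
    return 2.0 * kT / (v - b) ** 3 - 6.0 * a / v**4


def critical_point(a, b):
    """(P_c, v_c, k_B T_c) for the van der Waals equation."""
    return a / (27.0 * b * b), 3.0 * b, 8.0 * a / (27.0 * b)


if __name__ == "__main__":
    a, b = 1.3, 0.7
    p_c, v_c, kT_c = critical_point(a, b)
    print("dP/dv   at critical point:", dP_dv(v_c, kT_c, a, b))
    print("d2P/dv2 at critical point:", d2P_dv2(v_c, kT_c, a, b))
    print("P_c v_c / k_B T_c        :", p_c * v_c / kT_c)
```

The ratio comes out independent of `a` and `b`, as expected for an equation of state with only two material parameters.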




We can attempt to construct a generalized equation empirically by using three independent
critical coordinates, and finding $P_r = p_r(v_r, T_r)$ from the collapse of experimental
data. Such an approach has some limited success in describing similar gases, e.g. the
sequence of noble gases Ne, Xe, Kr, etc. However, different types of gases (e.g. diatomic,
ionic, etc.) show rather different behaviors. This is not particularly surprising given the
differences in the underlying Hamiltonians. We know rigorously from the virial expansion
that perturbative treatment of the interacting gas does depend on the details of the microscopic
potential. Thus the hope of finding a universal equation of state for all liquids
and gases has no theoretical or experimental justification; each case has to be studied
separately starting from its microscopic Hamiltonian. It is thus quite surprising that the
collapse of experimental data does in fact work very well in the vicinity of the critical
point, as described in the next section.


V.H Critical Point Behavior 


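To account for the universal behavior of gases close to their critical point, let us
examine the isotherms in the vicinity of $(P_c, v_c, T_c)$. For $T > T_c$, a Taylor expansion of
$P(T,v)$ in the vicinity of $v_c$ gives, for any $T$,

$$P(T,v) = P(T,v_c) + \left.\frac{\partial P}{\partial v}\right|_T (v - v_c) + \frac{1}{2}\left.\frac{\partial^2 P}{\partial v^2}\right|_T (v - v_c)^2 + \frac{1}{6}\left.\frac{\partial^3 P}{\partial v^3}\right|_T (v - v_c)^3 + \cdots. \tag{V.81}$$

Since $\partial P/\partial v|_T$ and $\partial^2 P/\partial v^2|_T$ are both zero at $T_c$, the expansion of the derivatives around
the critical point gives

$$\begin{aligned} P(T,v_c) &= P_c + \alpha(T - T_c) + \mathcal{O}\left[(T - T_c)^2\right], \\ \left.\frac{\partial P}{\partial v}\right|_{T,v_c} &= -a(T - T_c) + \mathcal{O}\left[(T - T_c)^2\right], \\ \left.\frac{\partial^2 P}{\partial v^2}\right|_{T,v_c} &= b(T - T_c) + \mathcal{O}\left[(T - T_c)^2\right], \\ \left.\frac{\partial^3 P}{\partial v^3}\right|_{T_c,v_c} &= -c + \mathcal{O}\left[(T - T_c)\right], \end{aligned} \tag{V.82}$$

where $\alpha$, $a$, $b$, and $c$ are material dependent constants. The stability condition, $\delta P\,\delta v \le 0$,
requires $a > 0$ (for $T > T_c$) and $c > 0$ (for $T = T_c$), but provides no information on the
sign of $b$. If an analytical expansion is possible, the isotherms of any gas in the vicinity of
its critical point must have the general form,

$$P(T,v) = P_c + \alpha(T - T_c) - a(T - T_c)(v - v_c) + \frac{b}{2}(T - T_c)(v - v_c)^2 - \frac{c}{6}(v - v_c)^3 + \cdots. \tag{V.83}$$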




Note that the third and fifth terms are of comparable magnitude for $(T - T_c) \sim (v - v_c)^2$.
The fourth (and higher order terms) can then be neglected when this condition is satisfied.
• The analytic expansion of eq.(V.83) results in the following predictions for behavior
close to the critical point:

(a) The gas compressibility diverges on approaching the critical point from the high temperature
side, along the critical isochore ($v = v_c$) as,

$$\lim_{T\to T_c^+}\kappa(T,v_c) = -\frac{1}{v_c}\left(\left.\frac{\partial P}{\partial v}\right|_T\right)^{-1} = \frac{1}{v_c\, a\,(T - T_c)}. \tag{V.84}$$

(b) The critical isotherm ($T = T_c$) behaves as

$$P = P_c - \frac{c}{6}(v - v_c)^3 + \cdots. \tag{V.85}$$

(c) Eq.(V.83) is manifestly inapplicable to $T < T_c$. However, we can try to extract information
for the low temperature side by applying the Maxwell construction to the unstable
isotherms. Actually, dimensional considerations are sufficient to show that on approaching
$T_c$ from the low temperature side, the specific volumes of the coexisting liquid and gas
phases approach each other as

$$\lim_{T\to T_c^-}\left(v_{\rm gas} - v_{\rm liquid}\right) \propto (T_c - T)^{1/2}. \tag{V.86}$$
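Predictions (a) and (b) can be checked on a concrete analytic equation of state. The sketch below uses the reduced van der Waals form of eq.(V.79), in which the critical point sits at $P_r = v_r = T_r = 1$, and estimates the exponents from log-log slopes; the step sizes `1e-3` and `1e-2` are arbitrary choices, and the function names are hypothetical.

```python
import math


def reduced_pressure(v, t):
    """Reduced van der Waals isotherm, P_r = 8 T_r / (3 v_r - 1) - 3 / v_r^2."""
    return 8.0 * t / (3.0 * v - 1.0) - 3.0 / v**2


def reduced_compressibility(t, v=1.0):
    """kappa = -(1 / v) (dP/dv)^(-1), with the derivative taken analytically."""
    dp_dv = -24.0 * t / (3.0 * v - 1.0) ** 2 + 6.0 / v**3
    return -1.0 / (v * dp_dv)


def log_slope(f, x1, x2):
    """Slope of ln f against ln x between two positive arguments."""
    return (math.log(f(x2)) - math.log(f(x1))) / (math.log(x2) - math.log(x1))


def mean_field_exponents():
    """Estimates of (gamma, delta) for the van der Waals equation."""
    gamma = -log_slope(lambda dt: reduced_compressibility(1.0 + dt), 1e-3, 1e-2)
    delta = log_slope(
        lambda dv: abs(reduced_pressure(1.0 + dv, 1.0) - 1.0), 1e-3, 1e-2
    )
    return gamma, delta


if __name__ == "__main__":
    print(mean_field_exponents())
```

The compressibility exponent comes out as one, and the critical isotherm exponent approaches three as the interval shrinks, in line with eqs.(V.84) and (V.85). The square root of eq.(V.86) would require a Maxwell construction at each temperature and is not attempted here.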

The liquid-gas transition for $T < T_c$ is accompanied by a discontinuity in density,
and the release of latent heat $L$. This kind of transition is usually referred to as first
order or discontinuous. The difference between the two phases disappears as the line of
first order transitions terminates at the critical point. The singular behavior at this point
is attributed to a second order or continuous transition. Eq.(V.83) follows from no more
than the constraints of mechanical stability, and the assumption of analytical isotherms.
Although there are some unknown coefficients, eqs.(V.84)-(V.86) predict universal forms
for the singularities close to the critical point and can be compared to experimental data.
The experimental results do indeed confirm the general expectations of singular behavior,
and there is universality in the results for different gases which can be summarized as


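(a) The compressibility diverges on approaching the critical point as

$$\lim_{T\to T_c^+}\kappa(T,v_c) \propto (T - T_c)^{-\gamma}, \quad\text{with}\quad \gamma \approx 1.3. \tag{V.87}$$

(b) The critical isotherm behaves as

$$(P - P_c) \propto (v - v_c)^{\delta}, \quad\text{with}\quad \delta \approx 5.0. \tag{V.88}$$

(c) The difference between liquid and gas densities vanishes close to the critical point as

$$\lim_{T\to T_c^-}\left(\rho_{\rm liquid} - \rho_{\rm gas}\right) \propto \lim_{T\to T_c^-}\left(v_{\rm gas} - v_{\rm liquid}\right) \propto (T_c - T)^{\beta}, \quad\text{with}\quad \beta \approx 0.3. \tag{V.89}$$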
These results clearly indicate that the assumption of analyticity of the isotherms,
leading to eq.(V.83), is not correct. The exponents $\delta$, $\gamma$, and $\beta$ appearing in eqs.(V.87)-(V.89)
are known as critical indices. Understanding the origin, universality, and numerical
values of these exponents is a fascinating subject explored in the modern theory of
critical phenomena.


117 


V.E Mean Field Theory of Condensation 


In principle, all properties of the interacting system, including phase separation, 
are contained within the thermodynamic potentials that can be obtained by evaluating 
Z(T,N) or O(T,). Phase transitions, however, are characterized by discontinuities in 
various state functions and must correspond to the appearance of singularities in the par- 
tition functions. At first glance, it is somewhat surprising that any singular behavior 
should emerge from computing such well behaved integrals (for short-ranged interactions) 


as 


Z(T,N,V) = Hin PpidG ee WE -F V.57 
(T,N,V) = —= Wien. oP ge = (Gi — q5)| - (V.57) 
= “J 


Instead of evaluating the integrals perturbatively, we shall now set up a reasonable ap- 
proximation scheme. The contributions of the hard core and attractive portions of the 


potential are again treated separately, and the partition function approximated by 


Z(T, N,V) = — =~ V(V -Q)---(V — (N — 1)Q) exp(—GU). (V.58) 
—_—_—$—_—_— 


Excluded volume effects 


109 


Here U represents an average attraction energy, obtained by assuming a uniform density 
n= N/V, as 


— 9 
tJ (V.59) 


The parameter u describes the net effect of the attractive interactions. Substituting into 


eq.(V.58) leads to the following approximation for the partition function 


(V —NQ/2)* BuN? 
Z(T,N ey ——_______ —— |. s 
From the resulting free energy, 
uN? 
F=-—-kpTlnZ = —NkgT ln(V — NOQ/2)+ NkpT ln(N/e) + 3NkpT Ind — BV (V.61) 
we obtain the expression for the pressure in the canonical ensemble as 
OF NkpT Ne 
Poan — — KB = (V.62) 


Vinny V—NOQ/2  2Vv? 


Remarkably, the uniform density approximation reproduces the van der Waals equa- 
tion of state. However, the self-consistency of this approximation can now be checked. 
As long as kr is positive, eq.(V.54) implies that the variance of density vanishes for large 
volumes as Ga): =kpTn?kr/V. But nr diverges at T., and at lower temperatures its 
negativity implies an instability towards density fluctuations as discussed in the previous 
section. When condensation occurs, there is phase separation into high (liquid) and low 
(gas) density states, and the uniform density assumption becomes manifestly incorrect. 
This difficulty is circumvented in the grand canonical ensemble. Once the chemical poten- 
tial is fixed, the number of particles (and hence density) in this ensemble is automatically 
adjusted to that of the appropriate phase. 

As the assumption of a uniform density is correct for both the liquid and gas phases, 
we can use the approximations of eqs.(V.59) and (V.60) to estimate the grand partition 


function 


foe) 2 
QT, u,V = e%"N Z(T,N,V) =~ S_ exp lv n(S = 5) + pun +AN , (V.63) 


N=0 N=0 


V.G Corresponding States 


We now have a good perturbative understanding of the behavior of a dilute interacting 
gas at high temperatures. At lower temperatures, attractive interactions lead to conden- 
sation into the liquid state. The qualitative behavior of the phase diagram is the same for 
most simple gases. There is a line of transitions in the coordinates (P,T) (corresponding 
to the coexistence of liquid and gas in the (V,7) plane) that terminates at a so called 
critical point. It is thus possible to transform a liquid to a gas without encountering any 
singularities. Since the ideal gas law is universal, i.e. independent of material, we may 
hope that there is also a generalized universal equation of state (presumably more com- 
plicated) that describes interacting gases, including liquid/gas condensation phenomena. 
This hope motivated the search for a law of corresponding states, obtained by appropriate 
rescalings of state functions. The most natural choice of scales for pressure, volume, and 


temperature are those of the critical point, (P., Ve, Tc). 


The van der Waals equation is an example of a generalized equation of state. Its 
critical point is found by setting OP/OV|,, and @ PIOV "|. to zero. The former is the 
limit of the flat coexistence portion of liquid/gas isotherms; the latter follows from the 
stability requirement kr > 0 (see the discussion after eq.(I.72)). The coordinates of the 


critical point are thus obtained from solving the following coupled equations, 


_ kel a 
eb wv 
OP _ kel Oy 
Ole C=pt ae = (V.77) 
P| — keT 6a _ 
Ov |, (v-b)3 ut 


where v = V/N is the volume per particle. The solution to these equations is 


113 


a 


Fe ore 
8a 
7 ai 
kpTe 7b 


Naturally, the critical point depends on the microscopic Hamiltonian (e.g. on the 2- 
body interaction) via the parameters a and b. However, we can scale out such dependencies 
by measuring P, T, and v in units of P., T. and v.. Setting P, = P/P., vy = v/vc, and 


T, = T/T,, a reduced version of the van der Waals equation is obtained as 


8 T, 3 
P, = = —— - as SC. V.79 
3u,—-1/3 v2 ( ) 
We have thus constructed a universal (material independent) equation of state. Since the 
original van der Waals equation depends only on two two parameters, eqs.(V.78) predict 


a universal dimensionless ratio, 


P.ve 3 
= — = 0.375. V.80 
kel, 8 ( ) 


Experimentally, this ratio is found to be in the range of 0.28 to 0.33. The van der Waals 


equation is thus not a good candidate for the putative universal equation of state. 


We can attempt to construct a generalized equation empirically by using three inde 
pendent critical coordinates, and finding P,. = p,(v,,T7;) from the collapse of experimental 
data. Such an approach has some limited success in describing similar gases, e.g. the 
sequence of noble gases Ne, Xe, Kr, etc. However, different types of gases (e.g. diatomic, 
ionic, etc.) show rather different behaviors. This is not particularly surprising given the 
differences in the underlying Hamiltonians. We know rigorously from the virial expansion 
that perturbative treatment of the interacting gas does depend on the details of the mi 
croscopic potential. Thus the hope of finding a universal equation of state for all liquids 
and gases has no theoretical or experimental justification; each case has to be studied 
separately starting from its microscopic Hamiltonian. It is thus quite surprising that the 
collapse of experimental data does in fact work very well in the vicinity of the critical 


point, as described in the next section. 


114 


V.H Critical Point Behavior 


To account for the universal behavior of gases close to their critical point, let us 
examine the isotherms in the vicinity of (P.,u-,T.). For T > T., a Taylor expansion of 


P(T,v) in the vicinity of v., for any T gives, 


OP iro er torr 
P(T,v) = P(T, ve) + — =O yea 


7a (v—ve)? +++. (V.81) 


Since 0P/Ov|, and 0?P/ dv? | are both zero at T., the expansion of the derivatives around 


the critical point gives 


P(T,v.) = P, + 0(T —T,) + O [(T -T)?] 


OP ‘ 
a ae = -a(T —T.) +O [(T —T.)*], 
0?P V.82 
— =(T-T.)+0|[(T-T)’], ( ) 
Ov Tv. 
03P 
De |p Ser O | aT. 


where a, b, and c are material dependent constants. The stability condition, 6Pdv < 0, 
requires a > 0 (for T > T.) and c > 0 (for T = T.), but provides no information on the 
sign of b. If an analytical expansion is possible, the isotherms of any gas in the vicinity of 


its critical point must have the general form, 


P(T,v) = P.+a(T-T.) ~a(P—T_)(v—ve) +5 (P ~Te)(v— ve)? (v—ve) + --. (V.83) 


115 


Note that the third and fifth terms are of comparable magnitude for (T — T.) ~ (uv — v-)?. 
The fourth (and higher order terms) can then be neglected when this condition is satisfied. 
e The analytic expansion of eq.(V.83) results in the following predictions for behavior 
close to the critical point: 

(a) The gas compressibility diverges on approaching the critical point from the high tem- 


perature side, along the critical isochore (v = v,) as, 


1 OP 1 
li T, ve) = -—- — = ———_.. V.84 
pe K(T, ee) Der OU: |r v,a(T — To) ( ) 
(b) The critical isotherm (T = T,) behaves as 
P= P.-F(v—u)e +o. (V.85) 


(c) Eq.(V.83) is manifestly inapplicable to T < T,. However, we can try to extract infor- 
mation for the low temperature side by applying the Maxwell construction to the unstable 
isotherms. Actually, dimensional considerations are sufficient to show that on approaching 
T.. from the low temperature side, the specific volumes of the coexisting liquid and gas 


phases approach each other as 


lim(Ugag— Diguid) o (Le = pM. (V.86) 
ToT 

The liquid—gas transition for T < J, is accompanied by a discontinuity in density, 
and the release of latent heat DL. This kind of transition is usually referred to as first 
order or discontinuous. The difference between the two phases disappears as the line of 
first order transitions terminates at the critical point. The singular behavior at this point 
is attributed to a second order or continuous transition. Eq.(V.83) follows from no more 
than the constraints of mechanical stability, and the assumption of analytical isotherms. 
Although there are some unknown coefficients, eqs.(V.85)—(V.86) predict universal forms 
for the singularities close to the critical point and can be compared to experimental data. 
The experimental results do indeed confirm the general expectations of singular behavior, 

and there is universality in the results for different gases which can be summarized as 


(a) The compressibility diverges on approaching the critical point as

$$\lim_{T\to T_c^+} \kappa(T, v_c) \propto (T - T_c)^{-\gamma}, \quad \text{with} \quad \gamma \approx 1.3 . \tag{V.87}$$




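(b) The critical isotherm behaves as

$$(P - P_c) \propto (v - v_c)^{\delta}, \quad \text{with} \quad \delta \approx 5.0 . \tag{V.88}$$

(c) The difference between liquid and gas densities vanishes close to the critical point as

$$\lim_{T\to T_c^-} \left(\rho_{\rm liquid} - \rho_{\rm gas}\right) \propto \lim_{T\to T_c^-} \left(v_{\rm gas} - v_{\rm liquid}\right) \propto (T_c - T)^{\beta}, \quad \text{with} \quad \beta \approx 0.3 . \tag{V.89}$$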
These results clearly indicate that the assumption of analyticity of the isotherms,
leading to eq.(V.83), is not correct. The exponents $\delta$, $\gamma$, and $\beta$ appearing in
eqs.(V.87)-(V.89) are known as critical indices. Understanding the origin, universality, and
numerical values of these exponents is a fascinating subject explored in the modern theory of
critical phenomena.




VI. Quantum Statistical Mechanics 


There are limitations to the applicability of classical statistical mechanics. The need 
to include quantum mechanical effects becomes especially apparent at low temperatures.
In this section we shall first demonstrate the failure of the classical results in the contexts 
of heat capacities of molecular gases and solids, and the ultra-violet catastrophe in black 


body radiation. We shall then reformulate statistical mechanics using quantum concepts. 


VI.A Dilute Polyatomic Gases 


Consider a dilute gas of polyatomic molecules. The Hamiltonian for each molecule of
$n$ atoms is

$$\mathcal{H}_1 = \sum_{i=1}^{n} \frac{\vec p_i^{\,2}}{2m} + \mathcal{V}(\vec q_1, \cdots, \vec q_n), \tag{VI.1}$$

where the potential energy $\mathcal{V}$ contains all the information on molecular bonds. For
simplicity, we have assumed that all atoms in the molecule have the same mass. If the masses
are different, the Hamiltonian can be brought into the above form by rescaling the coordinates
$\vec q_i$ by $\sqrt{m_i/m}$ (and the momenta by $\sqrt{m/m_i}$), where $m_i$ is the mass of the
$i^{\rm th}$ atom.


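Ignoring the interactions between molecules, the partition function of a dilute gas is

$$Z(N) = \frac{Z_1^N}{N!} = \frac{1}{N!}\left[\int \prod_{i=1}^{n}\frac{d^3\vec p_i\, d^3\vec q_i}{h^3}\, \exp\left(-\beta\sum_{i=1}^{n}\frac{\vec p_i^{\,2}}{2m} - \beta\,\mathcal{V}(\vec q_1,\cdots,\vec q_n)\right)\right]^N . \tag{VI.2}$$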


The chemical bonds that keep the molecule together are usually quite strong (energies
of the order of electron volts). At typical accessible temperatures, which are much
smaller than the corresponding dissociation temperatures ($\sim 10^4$ K), the molecule has
a well defined shape and only undergoes small deformations. The contribution of these
deformations to the one particle partition function $Z_1$ can be computed as follows:

(a) The first step is to find the equilibrium positions, $(\vec q_1^{\,*}, \cdots, \vec q_n^{\,*})$, by
minimizing the potential $\mathcal{V}$.

(b) The energy cost of small deformations about equilibrium is then obtained by setting
$\vec q_i = \vec q_i^{\,*} + \vec u_i$, and making an expansion in powers of $\vec u$,

$$\mathcal{V} = \mathcal{V}^* + \frac{1}{2}\sum_{i,j=1}^{n}\sum_{\alpha,\beta=1}^{3} \frac{\partial^2 \mathcal{V}}{\partial q_{i,\alpha}\,\partial q_{j,\beta}}\, u_{i,\alpha}\, u_{j,\beta} + O(u^3). \tag{VI.3}$$


(Here $i,j = 1,\cdots,n$ identify the atoms, and $\alpha,\beta = 1,2,3$ label a particular
component.) Since the expansion is around a stable equilibrium configuration, the first
derivatives are absent in eq.(VI.3), and the matrix of second derivatives is positive
semi-definite, i.e. it has only non-negative eigenvalues.

(c) The normal modes of the molecule are obtained by diagonalizing the $3n \times 3n$ matrix
$\partial^2\mathcal{V}/\partial q_{i,\alpha}\partial q_{j,\beta}$. The resulting $3n$ eigenvalues $\{K_s\}$ indicate the
stiffness of each mode. We can change variables from the original deformations $\{\vec u_i\}$
to the amplitudes $\{\tilde u_s\}$ of the eigenmodes. The corresponding conjugate momenta are
$\tilde p_s = m\,\dot{\tilde u}_s$. Since the transformation from $\{\vec u_i\}$ to $\{\tilde u_s\}$ is unitary
(preserving the length of a vector), $\sum_i \vec p_i^{\,2} = \sum_s \tilde p_s^{\,2}$, and the quadratic part of
the resulting deformation Hamiltonian is

$$\mathcal{H}_1 = \mathcal{V}^* + \sum_{s=1}^{3n}\left[\frac{\tilde p_s^{\,2}}{2m} + \frac{K_s}{2}\,\tilde u_s^{\,2}\right]. \tag{VI.4}$$

(Such transformations are also canonical, preserving the measure of integration in
phase space, $\prod_{i,\alpha} du_{i,\alpha}\, dp_{i,\alpha} = \prod_s d\tilde u_s\, d\tilde p_s$.)

The average energy of each molecule is the expectation value of the above Hamiltonian.
Since each quadratic degree of freedom classically contributes a factor of $k_BT/2$ to the
energy,

$$\langle \mathcal{H}_1 \rangle = \mathcal{V}^* + \frac{3n+m}{2}\, k_BT. \tag{VI.5}$$


Only modes with a finite stiffness can store potential energy, and $m$ is defined as the
number of such modes with non-zero $K_s$. The following symmetries of the potential force
some eigenvalues to zero:

(a) Translation symmetry: Since $\mathcal{V}(\vec q_1 + \vec c, \cdots, \vec q_n + \vec c) = \mathcal{V}(\vec q_1, \cdots, \vec q_n)$, no energy is
stored in the center of mass coordinate $\vec Q = \sum_\alpha \vec q_\alpha / n$, i.e. $\mathcal{V}(\vec Q) = \mathcal{V}(\vec Q + \vec c)$, and the
corresponding three values of $K_{\rm trans}$ are zero.

(b) Rotation symmetry: There is also no potential energy associated with rotations of
the molecule, and $K_{\rm rot} = 0$ for the corresponding stiffnesses. The number of rotational
modes, $0 \le r \le 3$, depends on the shape of the molecule; for example, a rod
shaped molecule has $r = 2$, as a rotation parallel to its axis does not result in a new
configuration.

The remaining $m = 3n - 3 - r$ eigenvectors of the matrix have non-zero stiffness, and
correspond to the vibrational normal modes. The energy per molecule, from eq.(VI.5), is
thus

$$\langle \mathcal{H}_1 \rangle = \mathcal{V}^* + \frac{6n - 3 - r}{2}\, k_BT. \tag{VI.6}$$




The corresponding heat capacities,

$$C_V = \frac{d\langle \mathcal{H}_1\rangle}{dT} = \frac{6n - 3 - r}{2}\, k_B, \quad \text{and} \quad C_P = C_V + k_B = \frac{6n - 1 - r}{2}\, k_B, \tag{VI.7}$$

are temperature independent. The ratio $\gamma = C_P/C_V$ is easily measured in adiabatic
processes. Values of $\gamma$, expected on the basis of the above argument, are listed below for
a number of different molecules.


Monatomic           He                n = 1    r = 0    γ = 5/3
Diatomic            O2 or CO          n = 2    r = 2    γ = 9/7
Linear triatomic    O-C-O             n = 3    r = 2    γ = 15/13
Planar triatomic    H-O-H (bent)      n = 3    r = 3    γ = 14/12 = 7/6
Tetra-atomic        NH3               n = 4    r = 3    γ = 20/18 = 10/9
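
The entries of this table follow mechanically from eq.(VI.7). As a minimal sketch (the function name `classical_gamma` and the dictionary of molecules are our own illustrative choices, not part of the notes), the ratios can be reproduced with exact fractions:

```python
from fractions import Fraction

def classical_gamma(n, r):
    """Classical ratio C_P / C_V for a molecule of n atoms with r rotational modes."""
    c_v = Fraction(6 * n - 3 - r, 2)  # C_V / k_B from eq. (VI.7)
    c_p = c_v + 1                     # C_P / k_B = C_V / k_B + 1
    return c_p / c_v

# (n, r) pairs taken from the table above
molecules = {
    "He": (1, 0),
    "O2": (2, 2),
    "CO2 (linear)": (3, 2),
    "H2O (planar)": (3, 3),
    "NH3": (4, 3),
}

for name, (n, r) in molecules.items():
    print(f"{name:14s} n={n} r={r} gamma={classical_gamma(n, r)}")
```

Working with `Fraction` avoids rounding, so the reduced forms 7/6 and 10/9 appear directly.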


Measurements of the heat capacity of dilute gases do not agree with the above predictions.
For example, the value $C_V/k_B = 7/2$, for a diatomic gas such as oxygen, is only
observed at temperatures higher than a few thousand degrees Kelvin. At room temperatures,
a lower value of 5/2 is observed, while at even lower temperatures of around 10 K, it
is further reduced to 3/2. The low temperature value is similar to that of a monatomic gas,
and suggests that no energy is stored in the rotational and vibrational degrees of freedom.
These observations can be explained if the allowed energy levels are quantized.

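• Vibrational modes: A diatomic molecule has one vibrational mode with stiffness
$K \equiv m\omega^2$, where $\omega$ is the frequency of oscillations. The classical partition function for
this mode is

$$Z^c_{\rm vib} = \int \frac{dp\, dq}{h} \exp\left[-\beta\left(\frac{p^2}{2m} + \frac{m\omega^2 q^2}{2}\right)\right] = \frac{1}{h}\sqrt{\frac{2\pi m}{\beta}}\sqrt{\frac{2\pi}{\beta m \omega^2}} = \frac{2\pi}{h\beta\omega} = \frac{k_BT}{\hbar\omega}, \tag{VI.8}$$

where $\hbar = h/2\pi$. The corresponding energy stored in this mode,

$$\langle \mathcal{H}_{\rm vib} \rangle^c = -\frac{\partial \ln Z}{\partial \beta} = \frac{\partial \ln(\beta\hbar\omega)}{\partial\beta} = \frac{1}{\beta} = k_BT, \tag{VI.9}$$

comes from $k_BT/2$ per kinetic and potential degrees of freedom. In quantum mechanics,
the allowed values of energy are quantized such that

$$\mathcal{H}^q_{\rm vib} = \hbar\omega\left(n + \frac{1}{2}\right), \tag{VI.10}$$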



with $n = 0,1,2,\cdots$. Assuming that the probability of each discrete level is proportional to
its Boltzmann weight (as will be justified later on), there is a normalization factor

$$Z^q_{\rm vib} = \sum_{n=0}^{\infty} e^{-\beta\hbar\omega(n+1/2)} = \frac{e^{-\beta\hbar\omega/2}}{1 - e^{-\beta\hbar\omega}} . \tag{VI.11}$$

The high temperature limit,

$$\lim_{\beta\to 0} Z^q_{\rm vib} = \frac{1}{\beta\hbar\omega} = \frac{k_BT}{\hbar\omega},$$

coincides with eq.(VI.8) (due in part to the choice of $h$ as the measure of classical phase
space).


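The expectation value of vibrational energy is

$$E^q_{\rm vib} = -\frac{\partial \ln Z}{\partial\beta} = \frac{\hbar\omega}{2} + \frac{\partial}{\partial\beta}\ln\left(1 - e^{-\beta\hbar\omega}\right) = \frac{\hbar\omega}{2} + \hbar\omega\,\frac{e^{-\beta\hbar\omega}}{1 - e^{-\beta\hbar\omega}} . \tag{VI.12}$$

The first term is the energy cost of quantum fluctuations that are present even in the zero
temperature ground state. The second term describes the additional energy due to thermal
fluctuations. The resulting heat capacity,

$$C^q_{\rm vib} = \frac{dE^q_{\rm vib}}{dT} = k_B\left(\frac{\hbar\omega}{k_BT}\right)^2 \frac{e^{-\beta\hbar\omega}}{\left(1 - e^{-\beta\hbar\omega}\right)^2}, \tag{VI.13}$$

achieves the classical value of $k_B$ only at temperatures $T \gg \theta_{\rm vib}$, where $\theta_{\rm vib} = \hbar\omega/k_B$ is a
characteristic temperature associated with the quanta of vibrational energy. For $T \ll \theta_{\rm vib}$,
$C^q_{\rm vib}$ goes to zero as $\exp(-\theta_{\rm vib}/T)$. Typical values of $\theta_{\rm vib}$ are in the range of $10^3$ to $10^4$
degrees Kelvin, explaining why the classical value of heat capacity is observed only at
higher temperatures.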
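
The two limits of eq.(VI.13) are easy to check numerically. The following is a small sketch under our own naming (`c_vib_over_kB` is a hypothetical helper, with temperature measured in units of $\theta_{\rm vib}$):

```python
import math

def c_vib_over_kB(t_over_theta):
    """Heat capacity of one quantized oscillator, eq. (VI.13), in units of k_B.

    t_over_theta is T / theta_vib, so x = theta_vib / T = hbar * omega / (k_B T).
    """
    x = 1.0 / t_over_theta
    one_minus_exp = -math.expm1(-x)  # 1 - exp(-x), accurate for small x
    return x * x * math.exp(-x) / one_minus_exp ** 2

for t in (0.05, 0.2, 1.0, 5.0, 100.0):
    print(f"T/theta_vib = {t:6.2f}   C/k_B = {c_vib_over_kB(t):.6f}")
```

The printed values rise monotonically from an exponentially small number toward 1, which is the crossover from the frozen to the classical regime described above.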

• Rotational modes: To account for the low temperature anomaly in the heat capacity of
diatomic molecules, we have to study the quantization of the rotational degrees of freedom.
Classically, the orientation of a diatomic molecule is specified by two angles $\theta$ and $\phi$, and
its Lagrangian (equal to the kinetic energy) is

$$\mathcal{L} = \frac{I}{2}\left(\dot\theta^2 + \sin^2\theta\, \dot\phi^2\right), \tag{VI.14}$$

where $I$ is the moment of inertia. In terms of the conjugate momenta,

$$p_\theta = \frac{\partial \mathcal{L}}{\partial \dot\theta} = I\dot\theta, \qquad p_\phi = \frac{\partial \mathcal{L}}{\partial \dot\phi} = I \sin^2\theta\, \dot\phi, \tag{VI.15}$$




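the Hamiltonian for rotations is

$$\mathcal{H}_{\rm rot} = \frac{1}{2I}\left(p_\theta^2 + \frac{p_\phi^2}{\sin^2\theta}\right) \equiv \frac{\vec L^{\,2}}{2I}, \tag{VI.16}$$

where $\vec L$ is the angular momentum. From the classical partition function,

$$Z^c_{\rm rot} = \frac{1}{h^2}\int_0^{\pi} d\theta \int_0^{2\pi} d\phi \int_{-\infty}^{\infty} dp_\theta\, dp_\phi\, \exp\left[-\frac{\beta}{2I}\left(p_\theta^2 + \frac{p_\phi^2}{\sin^2\theta}\right)\right] = \left(\frac{2\pi I}{\beta}\right)\frac{4\pi}{h^2} = \frac{2 I k_B T}{\hbar^2}, \tag{VI.17}$$

the stored energy is

$$\langle E_{\rm rot} \rangle^c = -\frac{\partial \ln Z}{\partial\beta} = \frac{\partial}{\partial\beta}\ln\left(\frac{\beta\hbar^2}{2I}\right) = k_BT, \tag{VI.18}$$

as expected for two degrees of freedom. In quantum mechanics, the allowed values of
angular momentum are quantized to $\vec L^{\,2} = \hbar^2 \ell(\ell+1)$ with $\ell = 0,1,2,\cdots$, and each state
has a degeneracy of $2\ell + 1$ (along a selected direction, $L_z/\hbar = -\ell,\cdots,+\ell$). A partition
function is now obtained for these levels as

$$Z^q_{\rm rot} = \sum_{\ell=0}^{\infty} \exp\left[-\frac{\beta\hbar^2 \ell(\ell+1)}{2I}\right](2\ell+1) = \sum_{\ell=0}^{\infty} \exp\left[-\frac{\theta_{\rm rot}\,\ell(\ell+1)}{T}\right](2\ell+1), \tag{VI.19}$$

where $\theta_{\rm rot} = \hbar^2/(2 I k_B)$ is a characteristic temperature associated with quanta of rotational
energy. While the sum cannot be analytically evaluated in general, we can study its high
and low temperature limits:

(a) For $T \gg \theta_{\rm rot}$, the terms in eq.(VI.19) vary slowly, and the sum can be replaced by the
integral

$$\lim_{T\to\infty} Z^q_{\rm rot} = \int_0^{\infty} dx\,(2x+1)\exp\left[-\frac{\theta_{\rm rot}\, x(x+1)}{T}\right] = \int_0^{\infty} dy\, e^{-\theta_{\rm rot}\, y/T} = \frac{T}{\theta_{\rm rot}} = Z^c_{\rm rot}, \tag{VI.20}$$

i.e. the classical result of eq.(VI.17) is recovered.

(b) For $T \ll \theta_{\rm rot}$, the first few terms dominate the sum, and

$$\lim_{T\to 0} Z^q_{\rm rot} = 1 + 3e^{-2\theta_{\rm rot}/T} + O\left(e^{-6\theta_{\rm rot}/T}\right), \tag{VI.21}$$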




leading to an energy

$$E^q_{\rm rot} = -\frac{\partial \ln Z}{\partial\beta} \approx -\frac{\partial}{\partial\beta}\ln\left[1 + 3e^{-2\theta_{\rm rot}/T}\right] \approx 6 k_B \theta_{\rm rot}\, e^{-2\theta_{\rm rot}/T} . \tag{VI.22}$$

The resulting heat capacity vanishes at low temperatures as

$$C_{\rm rot} = \frac{dE^q_{\rm rot}}{dT} = 3k_B\left(\frac{2\theta_{\rm rot}}{T}\right)^2 e^{-2\theta_{\rm rot}/T} + \cdots . \tag{VI.23}$$

Typical values of $\theta_{\rm rot}$ are between 1 and 10 K, explaining the lower temperature shoulder
in the heat capacity measurements. At very low temperatures, the only contributions come
from the kinetic energy of the center of mass, and the molecule behaves as a monatomic
particle. (The heat capacity vanishes at even lower temperatures due to quantum statistics,
as will be discussed in the context of identical particles.)
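
Although the sum in eq.(VI.19) has no closed form, it converges quickly and can be evaluated directly. The sketch below is only an illustration: the names `z_rot` and `c_rot_over_kB`, the cutoff `l_max`, and the use of the energy-fluctuation identity $C = (\langle E^2\rangle - \langle E\rangle^2)/k_BT^2$ are our own choices rather than steps taken in the notes. Temperatures are in units of $\theta_{\rm rot}$.

```python
import math

def z_rot(t_over_theta, l_max=200):
    """Truncated rotational partition sum of eq. (VI.19)."""
    return sum((2 * l + 1) * math.exp(-l * (l + 1) / t_over_theta)
               for l in range(l_max + 1))

def c_rot_over_kB(t_over_theta, l_max=200):
    """Rotational heat capacity in units of k_B from energy fluctuations."""
    z = e1 = e2 = 0.0
    for l in range(l_max + 1):
        eps = l * (l + 1)  # level energy in units of k_B * theta_rot
        w = (2 * l + 1) * math.exp(-eps / t_over_theta)
        z += w
        e1 += eps * w
        e2 += eps * eps * w
    mean, mean_sq = e1 / z, e2 / z
    return (mean_sq - mean ** 2) / t_over_theta ** 2

print(z_rot(50.0), c_rot_over_kB(50.0))  # high temperature: Z close to T/theta_rot, C close to 1
print(z_rot(0.2), c_rot_over_kB(0.2))    # low temperature: compare with eqs. (VI.21) and (VI.23)
```

At $T = 0.2\,\theta_{\rm rot}$ the sum is dominated by its first two terms, in line with eq.(VI.21), while at $T = 50\,\theta_{\rm rot}$ it approaches the classical value $T/\theta_{\rm rot}$ of eq.(VI.20).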


VI.B Vibrations of a Solid


Attractive interactions between particles first lead to condensation from a gas to liquid 
at low temperatures, and finally cause freezing into a solid state at even lower temperatures. 
For the purpose of discussing its thermodynamics, the solid can be regarded as a very large 
molecule subject to a Hamiltonian similar to eq.(VI.1), with $n = N \gg 1$ atoms. We can
then proceed with the steps outlined in the previous section. 

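(a) The classical ground state configuration of the solid is obtained by minimizing the
potential $\mathcal{V}$. In almost all cases, the minimum energy corresponds to a periodic arrangement
of atoms forming a lattice. In terms of the three basis vectors, $\hat a$, $\hat b$, and $\hat c$, the locations of
atoms in a simple crystal are given by

$$\vec q^{\,*}(\ell, m, n) = \left[\ell\,\hat a + m\,\hat b + n\,\hat c\right] \equiv \vec r, \tag{VI.24}$$

where $\{\ell, m, n\}$ is a triplet of integers.

(b) At finite temperatures, the atoms may undergo small deformations

$$\vec q_{\vec r} = \vec r + \vec u(\vec r), \tag{VI.25}$$

with a cost in potential energy of

$$\mathcal{V} = \mathcal{V}^* + \frac{1}{2}\sum_{\vec r, \vec r^{\,\prime}}\sum_{\alpha,\beta} \frac{\partial^2 \mathcal{V}}{\partial q_{\vec r,\alpha}\,\partial q_{\vec r^{\,\prime},\beta}}\, u_\alpha(\vec r)\, u_\beta(\vec r^{\,\prime}) + O(u^3). \tag{VI.26}$$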



(c) Finding the normal modes of a crystal is considerably simplified by its translational
symmetry. In particular, the matrix of second derivatives only depends on the relative
separation of two points,

$$\frac{\partial^2 \mathcal{V}}{\partial q_{\vec r,\alpha}\,\partial q_{\vec r^{\,\prime},\beta}} = K_{\alpha\beta}(\vec r - \vec r^{\,\prime}). \tag{VI.27}$$

It is always possible to take advantage of such a symmetry to at least partially diagonalize
the matrix by using a Fourier basis,

$$u_\alpha(\vec r) = {\sum_{\vec k}}' \frac{e^{i\vec k\cdot\vec r}}{\sqrt{N}}\, \tilde u_\alpha(\vec k). \tag{VI.28}$$

The sum is restricted to wavevectors $\vec k$ inside a Brillouin zone. For example, in a cubic
lattice of spacing $a$, each component of $\vec k$ is restricted to the interval $[-\pi/a, \pi/a]$.
This is because wavevectors outside this interval carry no additional information as
$(k_x + 2\pi m/a)(na) = k_x(na) + 2mn\pi$, and any phase that is a multiple of $2\pi$ does not
affect the sum in eq.(VI.28). In terms of the Fourier modes, the potential energy of
deformations is

$$\mathcal{V} = \mathcal{V}^* + \frac{1}{2N}\sum_{(\vec r,\vec r^{\,\prime}),(\vec k,\vec k'),\alpha,\beta} K_{\alpha\beta}(\vec r - \vec r^{\,\prime})\, e^{i\vec k\cdot\vec r}\, \tilde u_\alpha(\vec k)\, e^{i\vec k'\cdot\vec r^{\,\prime}}\, \tilde u_\beta(\vec k'). \tag{VI.29}$$


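We can change variables to relative and center of mass coordinates,

$$\vec\rho = \vec r - \vec r^{\,\prime}, \quad \text{and} \quad \vec R = \frac{\vec r + \vec r^{\,\prime}}{2},$$

by setting

$$\vec r = \vec R + \frac{\vec\rho}{2}, \quad \text{and} \quad \vec r^{\,\prime} = \vec R - \frac{\vec\rho}{2}.$$

Eq.(VI.29) now simplifies to

$$\mathcal{V} = \mathcal{V}^* + \frac{1}{2N}\sum_{\vec k,\vec k',\alpha,\beta}\left(\sum_{\vec R} e^{i(\vec k + \vec k')\cdot\vec R}\right)\left(\sum_{\vec\rho} K_{\alpha\beta}(\vec\rho)\, e^{i(\vec k - \vec k')\cdot\vec\rho/2}\right)\tilde u_\alpha(\vec k)\,\tilde u_\beta(\vec k'). \tag{VI.30}$$

As the sum in the first brackets is $N\delta_{\vec k + \vec k',0}$,

$$\mathcal{V} = \mathcal{V}^* + \frac{1}{2}\sum_{\vec k,\alpha,\beta} \tilde K_{\alpha\beta}(\vec k)\, \tilde u_\alpha(\vec k)\, \tilde u_\beta(\vec k)^*, \tag{VI.31}$$

where $\tilde K_{\alpha\beta}(\vec k) = \sum_{\vec r} K_{\alpha\beta}(\vec r)\exp(i\vec k\cdot\vec r)$, and $\tilde u_\alpha(\vec k)^* = \tilde u_\alpha(-\vec k)$ is the complex conjugate
of $\tilde u_\alpha(\vec k)$.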

The different Fourier modes are thus decoupled at the quadratic order, and the task
of diagonalizing the $3N \times 3N$ matrix of second derivatives is reduced to diagonalizing the
$3 \times 3$ matrix $\tilde K_{\alpha\beta}(\vec k)$ separately for each $\vec k$. The form of $\tilde K_{\alpha\beta}$ is further restricted by
the point group symmetries of the crystal. The discussion of such constraints is beyond
the intent of this section and for simplicity we shall assume that $\tilde K_{\alpha\beta}(\vec k) = \delta_{\alpha,\beta}\tilde K(\vec k)$ is
already diagonal. (For an isotropic material, this implies a specific relation between bulk
and shear moduli.)


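The kinetic energy of deformations is

$$\sum_{i=1}^{N}\sum_\alpha \frac{m}{2}\,\dot u_\alpha(\vec r_i)^2 = \sum_{\vec k,\alpha} \frac{m}{2}\,\dot{\tilde u}_\alpha(\vec k)\,\dot{\tilde u}_\alpha(\vec k)^* = \sum_{\vec k,\alpha} \frac{1}{2m}\,\tilde p_\alpha(\vec k)\,\tilde p_\alpha(\vec k)^*, \tag{VI.32}$$

where $\tilde p_\alpha(\vec k) = m\,\dot{\tilde u}_\alpha(\vec k)$ is the momentum associated with the mode amplitude
$\tilde u_\alpha(\vec k)$. The resulting deformation Hamiltonian,

$$\mathcal{H} = \mathcal{V}^* + \sum_{\vec k,\alpha}\left[\frac{1}{2m}\left|\tilde p_\alpha(\vec k)\right|^2 + \frac{\tilde K(\vec k)}{2}\left|\tilde u_\alpha(\vec k)\right|^2\right], \tag{VI.33}$$

describes $3N$ independent harmonic oscillators of frequencies $\omega_\alpha(\vec k) = \sqrt{\tilde K(\vec k)/m}$.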


In a classical treatment, each harmonic oscillator of non-zero stiffness contributes $k_BT$
to the internal energy of the solid. At most, 6 of the $3N$ oscillators are expected to have
zero stiffness (corresponding to uniform translations and rotations of the crystal). Thus,
up to non-extensive corrections of order $1/N$, the classical internal energy associated with
Hamiltonian (VI.33) is $3Nk_BT$, leading to a temperature independent heat capacity of
$3k_B$ per atom. In fact, the measured heat capacity vanishes at low temperatures. We can
again relate this observation to the quantization of the energy levels of each oscillator, as
discussed in the previous section. Quantizing each harmonic mode separately gives the
allowed values of the Hamiltonian as

$$\mathcal{H}^q = \mathcal{V}^* + \sum_{\vec k,\alpha} \hbar\omega_\alpha(\vec k)\left(n_\alpha(\vec k) + \frac{1}{2}\right), \tag{VI.34}$$




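where the set of integers $\{n_\alpha(\vec k)\}$ describes the quantum micro-state of the oscillators. Since
the oscillators are independent, their partition function,

$$Z^q = \sum_{\{n_\alpha(\vec k)\}} e^{-\beta\mathcal{H}^q} = e^{-\beta E_0}\prod_{\vec k,\alpha}\sum_{n_\alpha(\vec k)} e^{-\beta\hbar\omega_\alpha(\vec k)\, n_\alpha(\vec k)} = e^{-\beta E_0}\prod_{\vec k,\alpha}\left[\frac{1}{1 - e^{-\beta\hbar\omega_\alpha(\vec k)}}\right], \tag{VI.35}$$

is the product of single oscillator partition functions such as eq.(VI.11). ($E_0$ includes the
ground state energies of all oscillators in addition to $\mathcal{V}^*$.)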


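The internal energy is

$$E(T) = \langle \mathcal{H}^q \rangle = E_0 + \sum_{\vec k,\alpha} \hbar\omega_\alpha(\vec k)\,\left\langle n_\alpha(\vec k)\right\rangle, \tag{VI.36}$$

where the average occupation numbers are given by

$$\left\langle n_\alpha(\vec k)\right\rangle = \frac{\sum_{n=0}^{\infty} n\, e^{-\beta\hbar\omega_\alpha(\vec k) n}}{\sum_{n=0}^{\infty} e^{-\beta\hbar\omega_\alpha(\vec k) n}} = -\frac{\partial}{\partial\left(\beta\hbar\omega_\alpha(\vec k)\right)}\ln\left(\frac{1}{1 - e^{-\beta\hbar\omega_\alpha(\vec k)}}\right) = \frac{e^{-\beta\hbar\omega_\alpha(\vec k)}}{1 - e^{-\beta\hbar\omega_\alpha(\vec k)}} = \frac{1}{e^{\beta\hbar\omega_\alpha(\vec k)} - 1} . \tag{VI.37}$$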
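
As a quick numerical sanity check of eq.(VI.37), one can compare the ratio of truncated sums with the closed form. This is only a sketch: the function names and the truncation `n_max` are our own assumptions, and $x$ stands for the dimensionless combination $\beta\hbar\omega_\alpha(\vec k)$.

```python
import math

def occupation_direct(x, n_max=2000):
    """Average occupation from the explicit (truncated) sums over n."""
    weights = [math.exp(-x * n) for n in range(n_max + 1)]
    return sum(n * w for n, w in enumerate(weights)) / sum(weights)

def occupation_closed(x):
    """Closed form 1 / (exp(x) - 1) from eq. (VI.37)."""
    return 1.0 / math.expm1(x)

for x in (0.1, 0.5, 2.0, 10.0):
    print(x, occupation_direct(x), occupation_closed(x))
```

For small $x$ the occupation approaches the classical value $1/x = k_BT/\hbar\omega$, while for large $x$ it is exponentially suppressed.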


As a first attempt at including quantum mechanical effects, we may adopt the Einstein
model in which all the oscillators are assumed to have the same frequency $\omega_E$. This
model corresponds to atoms that are pinned to their ideal location by springs of stiffness
$K = \partial^2\mathcal{V}/\partial q^2 = m\omega_E^2$. The resulting internal energy,

$$E = E_0 + 3N\,\frac{\hbar\omega_E\, e^{-\beta\hbar\omega_E}}{1 - e^{-\beta\hbar\omega_E}}, \tag{VI.38}$$

and heat capacity,

$$C = \frac{dE}{dT} = 3Nk_B\left(\frac{T_E}{T}\right)^2 \frac{e^{-T_E/T}}{\left(1 - e^{-T_E/T}\right)^2}, \tag{VI.39}$$

are simply proportional to those of a single oscillator (eqs.(VI.12) and (VI.13)). In particular,
there is an exponential decay of the heat capacity to zero with a characteristic temperature
$T_E = \hbar\omega_E/k_B$. However, the experimentally measured heat capacity decays to zero much
more slowly, as $T^3$.

The discrepancy is resolved through the Debye model, which emphasizes that at low
temperatures the main contribution to heat capacity is from the oscillators of lowest
frequency that are the most easily excited. The lowest energy modes in turn correspond to
smallest wavevectors $k = |\vec k|$, or longest wavelengths $\lambda = 2\pi/k$. Indeed, the modes with




$\vec k = 0$ simply describe pure translations of the lattice and have zero stiffness. By
continuity we expect $\lim_{k\to 0}\tilde K(\vec k) = 0$, and ignoring considerations of crystal symmetry, the
expansion of $\tilde K(\vec k)$ at small wavevectors takes the form

$$\tilde K(\vec k) = Bk^2 + O(k^4).$$

The odd terms are absent in the expansion, since $\tilde K(\vec k) = \tilde K(-\vec k)$ follows from
$K(\vec r - \vec r^{\,\prime}) = K(\vec r^{\,\prime} - \vec r)$ in real space. The corresponding frequencies at long wavelengths are

$$\omega(\vec k) = \sqrt{\frac{Bk^2}{m}} = vk, \tag{VI.40}$$

where $v = \sqrt{B/m}$ is the speed of sound in the crystal. (In a real material, $\tilde K_{\alpha\beta}$ is not
proportional to $\delta_{\alpha\beta}$, and different polarizations of sound have different velocities.)
The quanta of vibrational modes are usually referred to as phonons. Using the
dispersion relation in eq.(VI.40), the contribution of phonons to the internal energy is

$$\langle \mathcal{H}^q \rangle = E_0 + \sum_{\vec k,\alpha} \frac{\hbar v k}{e^{\beta\hbar v k} - 1} . \tag{VI.41}$$


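With periodic boundary conditions in a box of dimensions $L_x \times L_y \times L_z$, the allowed
wavevectors are

$$\vec k = \left(\frac{2\pi n_x}{L_x}, \frac{2\pi n_y}{L_y}, \frac{2\pi n_z}{L_z}\right), \tag{VI.42}$$

where $n_x$, $n_y$, and $n_z$ are integers. In the large size limit, these modes are very densely
packed, and the number of modes in a volume element $d^3\vec k$ is

$$d\mathcal{N} = \frac{dk_x}{2\pi/L_x}\,\frac{dk_y}{2\pi/L_y}\,\frac{dk_z}{2\pi/L_z} = \frac{V}{(2\pi)^3}\, d^3\vec k \equiv \rho\, d^3\vec k. \tag{VI.43}$$

Using the phase space density $\rho$, any sum over allowed wavevectors can be replaced by an
integral as

$$\lim_{V\to\infty} \sum_{\vec k} f(\vec k) = \int d^3\vec k\; \rho\, f(\vec k). \tag{VI.44}$$

Hence, eq.(VI.41) can be re-written as

$$E = E_0 + 3V\int^{\rm B.Z.} \frac{d^3\vec k}{(2\pi)^3}\, \frac{\hbar v k}{e^{\beta\hbar v k} - 1}, \tag{VI.45}$$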



where the integral is performed over the volume of the Brillouin zone, and the factor of 3 
comes from assuming the same sound velocity for the three polarizations. 

Due to its dependence on the shape of the Brillouin zone, it is not possible to give
a simple closed form expression for the energy in eq.(VI.45). However, we can examine
its high and low temperature limits. The characteristic temperature separating the two
limits,

$$T_D = \frac{\hbar v k_{\max}}{k_B} \approx \frac{\hbar v}{k_B}\,\frac{\pi}{a}, \tag{VI.46}$$

corresponds to the high frequency modes at the edge of the zone. For $T \gg T_D$, all modes
behave classically. The integrand of eq.(VI.45) is just $k_BT$, and since the total number
of modes is $3N = 3V\int^{\rm B.Z.} d^3\vec k/(2\pi)^3$, the classical results of $E(T) = E_0 + 3Nk_BT$, and
$C = 3Nk_B$ are recovered. For $T \ll T_D$, the factor $\exp(\beta\hbar v k)$ in the denominator of
eq.(VI.45) is very large at the Brillouin zone edge. The most important contribution to
the integral comes from small $k$, and the error in extending the integration range to infinity
is small. After changing variables to $x = \beta\hbar v|\vec k|$, and using $d^3\vec k = 4\pi x^2 dx/(\beta\hbar v)^3$ due to
spherical symmetry, eq.(VI.45) gives

$$\lim_{T \ll T_D} E(T) \approx E_0 + \frac{3V}{8\pi^3}\left(\frac{k_BT}{\hbar v}\right)^3 4\pi\, k_BT \int_0^{\infty} dx\, \frac{x^3}{e^x - 1} = E_0 + \frac{\pi^2}{10}\, V\left(\frac{k_BT}{\hbar v}\right)^3 k_BT . \tag{VI.47}$$

(The value of the definite integral, $\pi^4/15 \approx 6.5$, can be found in standard tables.) The
resulting heat capacity,

$$C = \frac{dE}{dT} = k_B V\, \frac{2\pi^2}{5}\left(\frac{k_BT}{\hbar v}\right)^3, \tag{VI.48}$$

has the form $C \propto Nk_B(T/T_D)^3$ in agreement with observations. The physical
interpretation of this result is the following: At temperatures $T \ll T_D$, only a fraction of the
phonon modes can be thermally excited. These are the low frequency phonons with energy
quanta $\hbar\omega(\vec k) \ll k_BT$. The excited phonons have wavevectors $|\vec k| < k^*(T) \approx (k_BT/\hbar v)$.
Quite generally, in $d$ space dimensions, the number of these modes is approximately
$V k^*(T)^d \approx V(k_BT/\hbar v)^d$. Each excited mode can be treated classically, and contributes
roughly $k_BT$ to the internal energy which thus scales as $E \sim V(k_BT/\hbar v)^d\, k_BT$. The
corresponding heat capacity, $C \sim V k_B(k_BT/\hbar v)^d$, vanishes as $T^d$.
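
The numbers entering eq.(VI.47) can be verified without tables. The following sketch (the function name `bose_integral`, the cutoff `upper`, and the midpoint rule are our own choices) estimates the definite integral and the resulting prefactor:

```python
import math

def bose_integral(power=3, upper=60.0, steps=24000):
    """Midpoint-rule estimate of the integral of x**power / (exp(x) - 1) over [0, upper]."""
    h = upper / steps
    total = 0.0
    for i in range(steps):
        x = (i + 0.5) * h
        total += x ** power / math.expm1(x)
    return total * h

integral = bose_integral()
# Coefficient multiplying V (k_B T / hbar v)^3 k_B T in eq. (VI.47)
prefactor = 3.0 / (8.0 * math.pi ** 3) * 4.0 * math.pi * integral

print(integral, math.pi ** 4 / 15)    # both close to 6.49
print(prefactor, math.pi ** 2 / 10)   # both close to 0.987
```

The integrand decays like $x^3 e^{-x}$, so truncating at `upper = 60` introduces a negligible error; this is the same reasoning that justifies extending the Brillouin zone integral to infinity at low temperatures.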




VI.C Black-body Radiation


Phonons correspond to vibrations of a solid medium. There are also (longitudinal)
sound modes in liquid and gas states. However, even "empty" vacuum can support
fluctuations of the electromagnetic (EM) field, photons, which can be thermally excited at
finite temperatures. The normal modes of this field are EM waves, characterized by a
wave-number $\vec k$, and two possible polarizations $\alpha$. (Since $\nabla\cdot\vec E = 0$ in free space, the
electric field must be normal to $\vec k$, and only transverse modes exist.) With appropriate choice
of coordinates, the Hamiltonian for the EM field can be written as a sum of harmonic
oscillators,

$$\mathcal{H} = \frac{1}{2}\sum_{\vec k,\alpha}\left[\left|\tilde p_\alpha(\vec k)\right|^2 + \omega_\alpha(\vec k)^2\left|\tilde u_\alpha(\vec k)\right|^2\right], \tag{VI.49}$$

with $\omega_\alpha(\vec k) = ck$, where $c$ is the speed of light.

With periodic boundary conditions, the allowed wavevectors in a box of size $L$ are
$\vec k = 2\pi(n_x, n_y, n_z)/L$ where $\{n_x, n_y, n_z\}$ are integers. However, unlike phonons, there is
no Brillouin zone limiting the size of $\vec k$, and these integers can be arbitrarily large. The
lack of such a restriction leads to the ultraviolet catastrophe in a classical treatment: As
there is no limit to the wavevector, assigning $k_BT$ per mode leads to an infinite energy
stored in the high frequency modes. (The low frequencies are cut off by the finite size of
the box.) It was indeed to resolve this difficulty that Planck suggested that the allowed
values of EM energy must be quantized according to the Hamiltonian

$$\mathcal{H}^q = \sum_{\vec k,\alpha} \hbar c k\left(n_\alpha(\vec k) + \frac{1}{2}\right), \quad \text{with} \quad n_\alpha(\vec k) = 0,1,2,\cdots . \tag{VI.50}$$

As for phonons, the internal energy is calculated from

$$E = \langle \mathcal{H}^q \rangle = \sum_{\vec k,\alpha} \hbar c k\left(\frac{1}{2} + \frac{e^{-\beta\hbar c k}}{1 - e^{-\beta\hbar c k}}\right) = E_0 + \frac{2V}{(2\pi)^3}\int d^3\vec k\, \frac{\hbar c k}{e^{\beta\hbar c k} - 1}, \tag{VI.51}$$

where $E_0$ denotes the zero-point contribution. The zero-point energy is actually infinite,
but as only energy differences are measured, it is usually ignored. The change of variables
to $x = \beta\hbar c k$ allows us to calculate the excitation energy,

$$\frac{E^*}{V} = \frac{\hbar c}{\pi^2}\left(\frac{k_BT}{\hbar c}\right)^4 \int_0^{\infty} \frac{dx\, x^3}{e^x - 1} = \frac{\pi^2}{15}\left(\frac{k_BT}{\hbar c}\right)^3 k_BT . \tag{VI.52}$$




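The EM radiation also exerts a pressure on the walls of the container. From the
partition function of the Hamiltonian (VI.50),

$$Z = \sum_{\{n_\alpha(\vec k)\}} \exp\left[-\beta\sum_{\vec k,\alpha}\hbar c k\left(n_\alpha(\vec k) + \frac{1}{2}\right)\right] = \prod_{\vec k,\alpha} \frac{e^{-\beta\hbar c k/2}}{1 - e^{-\beta\hbar c k}}, \tag{VI.53}$$

the free energy is

$$F = -k_BT\ln Z = k_BT\sum_{\vec k,\alpha}\left[\frac{\beta\hbar c k}{2} + \ln\left(1 - e^{-\beta\hbar c k}\right)\right] = 2V\int \frac{d^3\vec k}{(2\pi)^3}\left[\frac{\hbar c k}{2} + k_BT\ln\left(1 - e^{-\beta\hbar c k}\right)\right]. \tag{VI.54}$$

The pressure due to the photon gas is

$$\begin{aligned}
P &= -\left.\frac{\partial F}{\partial V}\right|_T = -\int \frac{d^3\vec k}{(2\pi)^3}\left[\hbar c k + 2k_BT\ln\left(1 - e^{-\beta\hbar c k}\right)\right] \\
  &= P_0 - \frac{k_BT}{\pi^2}\int_0^{\infty} dk\, k^2 \ln\left(1 - e^{-\beta\hbar c k}\right) \qquad \text{(integrate by parts)} \\
  &= P_0 + \frac{k_BT}{\pi^2}\int_0^{\infty} dk\, \frac{k^3}{3}\, \frac{\beta\hbar c\, e^{-\beta\hbar c k}}{1 - e^{-\beta\hbar c k}} \qquad \text{(compare with eq.(VI.51))} \\
  &= P_0 + \frac{1}{3}\,\frac{E^*}{V} .
\end{aligned} \tag{VI.55}$$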


Note that there is also an infinite zero-point pressure $P_0$. Differences in this pressure lead
to the Casimir force between conducting plates, which is measurable.

The extra pressure of 1/3 times the energy density can be compared to that of a gas
of relativistic particles. (As shown in the test problems, a dispersion relation, $\varepsilon \propto |\vec p|^s$,
leads to a pressure, $P = (s/d)(E/V)$ in $d$ dimensions.) Continuing with the analogy to a
gas of particles, if a hole is opened in the container wall, the escaping energy flux per unit
area and per unit time is

$$\phi = \langle c_\perp \rangle\, \frac{E^*}{V} . \tag{VI.56}$$

All photons have speed $c$, and the average of the component of the velocity perpendicular
to the hole can be calculated as

$$\langle c_\perp \rangle = c \times \frac{1}{4\pi}\int_0^{\pi/2} 2\pi \sin\theta\, d\theta\, \cos\theta = \frac{c}{4}, \tag{VI.57}$$

resulting in

$$\phi = \frac{1}{4}\, c\, \frac{E^*}{V} = \frac{\pi^2}{60}\, \frac{k_B^4 T^4}{\hbar^3 c^2} . \tag{VI.58}$$




The result, $\phi = \sigma T^4$, is the Stefan-Boltzmann law for blackbody radiation, and

$$\sigma = \frac{\pi^2}{60}\, \frac{k_B^4}{\hbar^3 c^2} \approx 5.67 \times 10^{-8}\ {\rm W\, m^{-2}\, K^{-4}}, \tag{VI.59}$$

is Stefan's constant. Blackbody radiation has a characteristic frequency dependence: Let
$E(T)/V = \int dk\, \mathcal{E}(k,T)$, where

$$\mathcal{E}(k,T) = \frac{\hbar c}{\pi^2}\, \frac{k^3}{e^{\beta\hbar c k} - 1}, \tag{VI.60}$$

is the energy density in wavevector $k$. The flux of emitted radiation in the interval $[k, k+dk]$
is $I(k,T)\,dk$, where

$$I(k,T) = \frac{c}{4}\,\mathcal{E}(k,T) = \frac{\hbar c^2}{4\pi^2}\, \frac{k^3}{e^{\beta\hbar c k} - 1} \to \begin{cases} c\, k_BT\, k^2/4\pi^2 & \text{for } k \ll k^*(T) \\ \hbar c^2 k^3 e^{-\beta\hbar c k}/4\pi^2 & \text{for } k \gg k^*(T) \end{cases} . \tag{VI.61}$$

The characteristic wavevector $k^*(T) \approx k_BT/\hbar c$ separates quantum and classical regimes.
It provides the upper cutoff that eliminates the ultraviolet catastrophe.
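
Two numbers in this section can be checked with a few lines of code: the value of Stefan's constant in eq.(VI.59), and the location of the maximum of the spectrum in eq.(VI.61) in units of $k^*(T)$. The sketch below is illustrative only; the constant names, the rounded value of $\hbar$, and the fixed-point iteration are our own choices. Setting the derivative of $x^3/(e^x - 1)$ to zero gives the condition $x = 3(1 - e^{-x})$.

```python
import math

K_B = 1.380649e-23      # Boltzmann constant in J/K (exact SI value)
HBAR = 1.054571817e-34  # reduced Planck constant in J s (rounded)
C = 299792458.0         # speed of light in m/s (exact SI value)

def stefan_constant():
    """sigma = pi^2 k_B^4 / (60 hbar^3 c^2), eq. (VI.59)."""
    return math.pi ** 2 * K_B ** 4 / (60.0 * HBAR ** 3 * C ** 2)

def spectrum_peak(iterations=50):
    """Solve x = 3 (1 - exp(-x)) for the maximum of x**3 / (exp(x) - 1)."""
    x = 3.0
    for _ in range(iterations):
        x = 3.0 * (1.0 - math.exp(-x))
    return x

print(stefan_constant())  # approximately 5.67e-8 W m^-2 K^-4
print(spectrum_peak())    # approximately 2.82, i.e. the peak sits near 2.82 k_B T / (hbar c)
```

The peak position confirms that $k^*(T) \approx k_BT/\hbar c$ sets the scale at which the spectrum turns over from the classical $k^2$ growth to the exponential decay.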




VI.D Quantum microstates 


In the previous sections we indicated several failures of classical statistical mechanics,
which were heuristically remedied by assuming quantized energy levels, while still
calculating thermodynamic quantities from a partition sum $Z = \sum_n \exp(-\beta E_n)$. This implicitly
assumes that the micro-states of a quantum system are specified by its discretized energy
levels, and governed by a probability distribution similar to a Boltzmann weight. This
'analogy' to classical statistical mechanics needs to be justified. Furthermore, quantum
mechanics is itself inherently probabilistic, with uncertainties unrelated to those that lead
to probabilities in statistical mechanics. Here, we shall construct a quantum formulation
of statistical mechanics by closely following the steps that lead to the classical formulation.

Micro-states of a classical system of particles are described by the set of coordinates
and conjugate momenta $\{\vec p_i, \vec q_i\}$; i.e. by a point in the $6N$-dimensional phase space. In
quantum mechanics $\{\vec q_i\}$ and $\{\vec p_i\}$ are not independent observables. Instead:

• The (micro-) state of a quantum system is completely specified by a unit vector $|\Psi\rangle$, which
belongs to an infinite dimensional Hilbert space. The vector $|\Psi\rangle$ can be written in terms
of its components $\langle n|\Psi\rangle$, which are complex numbers, along a suitable set of ortho-normal
basis vectors $|n\rangle$. In the convenient notation introduced by Dirac, this decomposition is
written as

$$|\Psi\rangle = \sum_n \langle n|\Psi\rangle\, |n\rangle . \tag{VI.62}$$

The most familiar basis is that of coordinates $|\{\vec q_i\}\rangle$, and $\langle\{\vec q_i\}|\Psi\rangle \equiv \Psi(\vec q_1, \cdots, \vec q_N)$ is the
wave-function. The normalization condition is

$$\langle\Psi|\Psi\rangle = \sum_n \langle\Psi|n\rangle\langle n|\Psi\rangle = 1, \quad \text{where} \quad \langle\Psi|n\rangle \equiv \langle n|\Psi\rangle^* . \tag{VI.63}$$

For example, in the coordinate basis, we must require

$$\langle\Psi|\Psi\rangle = \int \prod_{i=1}^{N} d^3\vec q_i\, \left|\Psi(\vec q_1, \cdots, \vec q_N)\right|^2 = 1 . \tag{VI.64}$$

• Classically, various observables are functions $\mathcal{O}(\{\vec p_i, \vec q_i\})$, defined in phase space. In
quantum mechanics, these functions are replaced by Hermitian matrices (operators) in
Hilbert space, obtained by substituting operators for $\{\vec q_i\}$ and $\{\vec p_i\}$ in the classical
expression (after proper symmetrization of products, e.g. $pq \to (pq + qp)/2$). These basic
operators satisfy the commutation relations,

$$[p_j, q_k] \equiv p_j q_k - q_k p_j = \frac{\hbar}{i}\,\delta_{j,k} . \tag{VI.65}$$




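For example, in the coordinate basis $|\{\vec q_i\}\rangle$, the momentum operators are

$$p_j = \frac{\hbar}{i}\,\frac{\partial}{\partial q_j} . \tag{VI.66}$$

(Note that classical Poisson brackets satisfy $\{p_j, q_k\} = -\delta_{j,k}$. Quite generally, quantum
commutation relations are obtained by multiplying the corresponding classical Poisson
brackets by $i\hbar$.)
• Unlike in classical mechanics, the value of an operator $\mathcal{O}$ is not uniquely determined for
a particular micro-state. It is instead a random variable, whose average in a state $|\Psi\rangle$ is
given by

$$\langle\mathcal{O}\rangle \equiv \langle\Psi|\mathcal{O}|\Psi\rangle \equiv \sum_{m,n} \langle\Psi|m\rangle\langle m|\mathcal{O}|n\rangle\langle n|\Psi\rangle . \tag{VI.67}$$

For example,

$$\langle U(\{\vec q\})\rangle = \int \prod_{i=1}^{N} d^3\vec q_i\, \Psi^*(\vec q_1, \cdots, \vec q_N)\, U(\{\vec q\})\, \Psi(\vec q_1, \cdots, \vec q_N), \quad \text{and}$$

$$\langle K(\{\vec p\})\rangle = \int \prod_{i=1}^{N} d^3\vec q_i\, \Psi^*(\vec q_1, \cdots, \vec q_N)\, K\left(\left\{\frac{\hbar}{i}\frac{\partial}{\partial\vec q}\right\}\right) \Psi(\vec q_1, \cdots, \vec q_N) .$$

To ensure that the expectation value $\langle\mathcal{O}\rangle$ is real, the operators $\mathcal{O}$ must be Hermitian, i.e.
satisfy

$$\mathcal{O}^\dagger = \mathcal{O}, \quad \text{where} \quad \langle m|\mathcal{O}^\dagger|n\rangle \equiv \langle n|\mathcal{O}|m\rangle^* . \tag{VI.68}$$

When replacing $\vec p$ and $\vec q$ in a classical operator $\mathcal{O}(\{\vec p_i, \vec q_i\})$ by corresponding matrices,
proper symmetrization of various products is necessary to ensure the above Hermiticity.
Time evolution of micro-states is governed by the Hamiltonian $\mathcal{H}(\{\vec p_i, \vec q_i\})$. A classical
microstate evolves according to Hamilton's equations of motion, while in quantum
mechanics the state vector changes in time according to

$$i\hbar\,\frac{\partial}{\partial t}|\Psi(t)\rangle = \mathcal{H}|\Psi(t)\rangle . \tag{VI.69}$$

A convenient basis is one that diagonalizes the matrix $\mathcal{H}$. The energy eigen-states satisfy
$\mathcal{H}|n\rangle = E_n|n\rangle$, where $E_n$ are the eigen-energies. Substituting $|\Psi(t)\rangle = \sum_n \langle n|\Psi(t)\rangle|n\rangle$ in
eq.(VI.69), and taking advantage of the ortho-normality condition $\langle m|n\rangle = \delta_{m,n}$, yields

$$i\hbar\,\frac{d}{dt}\langle n|\Psi(t)\rangle = E_n\langle n|\Psi(t)\rangle, \quad \Longrightarrow \quad \langle n|\Psi(t)\rangle = \exp\left(-\frac{iE_nt}{\hbar}\right)\langle n|\Psi(0)\rangle . \tag{VI.70}$$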




The quantum states at two different times can be related by a time evolution operator, as

$$|\Psi(t)\rangle = U(t, t_0)|\Psi(t_0)\rangle, \tag{VI.71}$$

which satisfies $i\hbar\,\partial_t U(t, t_0) = \mathcal{H}U(t, t_0)$, with the boundary condition $U(t_0, t_0) = 1$. If $\mathcal{H}$
is independent of $t$, we can solve these equations to yield

$$U(t, t_0) = \exp\left[-\frac{i}{\hbar}\mathcal{H}(t - t_0)\right] . \tag{VI.72}$$
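
In the basis of energy eigen-states, eqs.(VI.70) and (VI.72) reduce to multiplying each amplitude by a phase. A minimal sketch follows; the three-level spectrum, the helper names, and the choice of units with $\hbar = 1$ are hypothetical and serve only as an illustration.

```python
import cmath
import math

def evolve(amplitudes, energies, t, t0=0.0, hbar=1.0):
    """Apply U(t, t0) = exp(-i H (t - t0) / hbar) in the basis where H is diagonal."""
    return [a * cmath.exp(-1j * e * (t - t0) / hbar)
            for a, e in zip(amplitudes, energies)]

def norm_squared(amplitudes):
    """Sum of |<n|Psi>|^2 over the basis states."""
    return sum(abs(a) ** 2 for a in amplitudes)

energies = [0.0, 1.0, 2.5]            # hypothetical eigen-energies
psi0 = [1.0 / math.sqrt(3.0)] * 3     # normalized equal-weight superposition

psi_t = evolve(psi0, energies, t=3.7)
print(norm_squared(psi_t))            # stays equal to 1 up to rounding

# Composition property: U(3, 1) U(1, 0) = U(3, 0)
two_steps = evolve(evolve(psi0, energies, t=1.0), energies, t=3.0, t0=1.0)
one_step = evolve(psi0, energies, t=3.0)
print(max(abs(a - b) for a, b in zip(two_steps, one_step)))
```

Because each component only acquires a phase, the normalization of eq.(VI.63) is preserved at all times, as expected for a unitary $U(t, t_0)$.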


VI.E Quantum macrostates 


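Macro-states of the system depend on only a few thermodynamic functions. We can
form an ensemble of a large number $\mathcal{N}$ of micro-states $\mu_\alpha$, corresponding to a given
macro-state. The different micro-states occur with probabilities $p_\alpha$. (For example $p_\alpha = 1/\mathcal{N}$ in
the absence of any other information.) When we no longer have exact knowledge of the
microstate, the system is said to be in a mixed state.

Classically, ensemble averages are calculated from

$$\overline{\mathcal{O}(\{\vec p_i, \vec q_i\})}_t = \sum_\alpha p_\alpha\, \mathcal{O}(\mu_\alpha(t)) = \int \prod_{i=1}^{N} d^3\vec p_i\, d^3\vec q_i\; \mathcal{O}(\{\vec p_i, \vec q_i\})\, \rho(\{\vec p_i, \vec q_i\}, t), \tag{VI.73}$$

where

$$\rho(\{\vec p_i, \vec q_i\}, t) = \sum_\alpha p_\alpha \prod_{i=1}^{N} \delta^3\left(\vec q_i - \vec q_i(t)_\alpha\right)\delta^3\left(\vec p_i - \vec p_i(t)_\alpha\right), \tag{VI.74}$$

is the ensemble density.
Similarly, a mixed quantum state is obtained from a set of possible states $\{|\Psi_\alpha\rangle\}$, with
probabilities $\{p_\alpha\}$. The ensemble average of the quantum mechanical expectation value in
eq.(VI.67) is then given by

$$\overline{\langle\mathcal{O}\rangle} = \sum_\alpha p_\alpha \langle\Psi_\alpha|\mathcal{O}|\Psi_\alpha\rangle = \sum_{\alpha,m,n} p_\alpha \langle\Psi_\alpha|m\rangle\langle n|\Psi_\alpha\rangle\langle m|\mathcal{O}|n\rangle = \sum_{m,n} \langle n|\rho|m\rangle\langle m|\mathcal{O}|n\rangle = {\rm tr}(\rho\,\mathcal{O}), \tag{VI.75}$$

where we have introduced a basis $\{|n\rangle\}$, and defined the density matrix

$$\langle n|\rho(t)|m\rangle \equiv \sum_\alpha p_\alpha \langle n|\Psi_\alpha(t)\rangle\langle\Psi_\alpha(t)|m\rangle . \tag{VI.76}$$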




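Classically, the probability (density) $\rho(t)$ is a function defined on phase space. As all
operators in phase space, it is replaced by a matrix in quantum mechanics. Stripped of
the choice of basis, the density matrix is

$$\rho(t) = \sum_\alpha p_\alpha |\Psi_\alpha(t)\rangle\langle\Psi_\alpha(t)| . \tag{VI.77}$$

Clearly, $\rho$ corresponds to a pure state if and only if $\rho^2 = \rho$.
The density matrix satisfies the following properties:

(i) Normalization: Since each $|\Psi_\alpha\rangle$ is normalized to unity,

$$\langle 1\rangle = {\rm tr}(\rho) = \sum_n \langle n|\rho|n\rangle = \sum_{\alpha,n} p_\alpha \left|\langle n|\Psi_\alpha\rangle\right|^2 = \sum_\alpha p_\alpha = 1 . \tag{VI.78}$$

(ii) Hermiticity: The density matrix is Hermitian, i.e. $\rho^\dagger = \rho$, since

$$\langle m|\rho^\dagger|n\rangle = \langle n|\rho|m\rangle^* = \sum_\alpha p_\alpha \langle\Psi_\alpha|n\rangle\langle m|\Psi_\alpha\rangle = \langle m|\rho|n\rangle, \tag{VI.79}$$

ensuring that the averages in eq.(VI.75) are real numbers.

(iii) Positivity: For any $|\Phi\rangle$,

$$\langle\Phi|\rho|\Phi\rangle = \sum_\alpha p_\alpha \langle\Phi|\Psi_\alpha\rangle\langle\Psi_\alpha|\Phi\rangle = \sum_\alpha p_\alpha \left|\langle\Phi|\Psi_\alpha\rangle\right|^2 \ge 0 . \tag{VI.80}$$

Thus $\rho$ is positive semi-definite, implying that all its eigenvalues are non-negative.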
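
The properties (i)-(iii) and the purity criterion $\rho^2 = \rho$ can be illustrated on a two-state system. The sketch below builds $\rho$ from eq.(VI.77) using plain nested lists; the helper names, the particular ensemble, and the observable (a diagonal matrix with entries $\pm 1$) are hypothetical choices made for illustration.

```python
import math

def density_matrix(states, probs):
    """rho[n][m] = sum_a p_a <n|Psi_a> <Psi_a|m>, following eq. (VI.76)."""
    dim = len(states[0])
    return [[sum(p * psi[n] * psi[m].conjugate() for p, psi in zip(probs, states))
             for m in range(dim)] for n in range(dim)]

def matmul(a, b):
    """Product of two square matrices stored as nested lists."""
    dim = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(dim)) for j in range(dim)]
            for i in range(dim)]

def trace(a):
    return sum(a[i][i] for i in range(len(a)))

s = 1.0 / math.sqrt(2.0)
up = [1 + 0j, 0 + 0j]        # basis state |0>
plus = [s + 0j, s + 0j]      # superposition (|0> + |1>) / sqrt(2)

rho_mixed = density_matrix([up, plus], [0.5, 0.5])
rho_pure = density_matrix([plus], [1.0])
sigma_z = [[1 + 0j, 0j], [0j, -1 + 0j]]   # hypothetical observable

print(trace(rho_mixed))                         # normalization: 1
print(trace(matmul(rho_mixed, rho_mixed)))      # less than 1 for a mixed state
print(trace(matmul(rho_pure, rho_pure)))        # equal to 1 for a pure state
print(trace(matmul(rho_mixed, sigma_z)))        # ensemble average tr(rho O), eq. (VI.75)
```

For this ensemble ${\rm tr}(\rho^2) = 0.75$, so $\rho^2 \neq \rho$ and the state is mixed, while the single-state ensemble reproduces ${\rm tr}(\rho^2) = 1$.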


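• Liouville's theorem governs the time evolution of the classical density as

$$\frac{d\rho}{dt} = \frac{\partial\rho}{\partial t} - \{\mathcal{H}, \rho\} = 0 . \tag{VI.81}$$

It is most convenient to examine the evolution of the quantum density matrix in the basis
of energy eigen-states, where according to eq.(VI.70)

$$\begin{aligned}
i\hbar\,\frac{\partial}{\partial t}\langle n|\rho(t)|m\rangle &= i\hbar\,\frac{\partial}{\partial t}\sum_\alpha p_\alpha \langle n|\Psi_\alpha(t)\rangle\langle\Psi_\alpha(t)|m\rangle \\
&= \sum_\alpha p_\alpha \left[(E_n - E_m)\langle n|\Psi_\alpha\rangle\langle\Psi_\alpha|m\rangle\right] \\
&= \langle n|(\mathcal{H}\rho - \rho\mathcal{H})|m\rangle .
\end{aligned} \tag{VI.82}$$

The final result is a tensorial identity, and hence, independent of the choice of basis

$$i\hbar\,\frac{\partial}{\partial t}\rho = [\mathcal{H}, \rho] . \tag{VI.83}$$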




Equilibrium requires time independent averages, and suggests $\partial\rho/\partial t = 0$. This
condition is satisfied in both eqs.(VI.81) and (VI.83) by choosing $\rho = \rho(\mathcal{H})$. (As discussed
in chapter III, $\rho$ may also depend on conserved quantities $\vec L$ such that $[\mathcal{H}, \vec L] = 0$.) Various
equilibrium quantum density matrices can now be constructed in analogy to classical
statistical mechanics.

e Microcanonical ensemble: As the internal energy has a fixed value E, a density matrix 


that includes this constraint is 
d(H — E) 


Q(E) 


In particular, in the basis of energy eigen-states, 


o(E) = (VI.84) 


S i é, = 2; and m= nH; 


(nlp|m) = Spe (n|¥a)(Balm) = (VL85) 
o O. 4b 2,4. Or ei. 

The first condition states that only eigen-states of the correct energy can appear in the 
quantum wave-function, and that (for pz = 1/N) such states on average have the same 
amplitude, l(nJw) |? = 1/Q. This is equivalent to the classical postulate of equal a priori 
equilibrium probabilities. The second (additional) condition states that the Q eigen-states 
of energy E are combined in a typical micro-state with independent random phases. (Note 
that the normalization condition tr p = 1, implies that Q(£) = )°,, 6(E—E,,) is the number 
of eigen-states of 7 with energy EF.) 
• Canonical ensemble: A fixed temperature $T = 1/(k_B\beta)$ can be achieved by putting the
system in contact with a reservoir. By considering the combined system, the canonical
density matrix is obtained as
$$\rho(\beta) = \frac{\exp(-\beta H)}{Z(\beta)}. \tag{VI.86}$$
The normalization condition $\mathrm{tr}(\rho) = 1$ leads to the quantum partition function
$$Z = \mathrm{tr}\left(e^{-\beta H}\right) = \sum_n e^{-\beta E_n}. \tag{VI.87}$$


The final sum is over the (discrete) energy levels of H, and justifies the calculations per- 
formed in the previous sections. 

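• Grand Canonical Ensemble: The number of particles $N$ is no longer fixed. Quantum
micro-states with indefinite particle number span a so-called Fock space. The density
matrix is
$$\rho(\beta,\mu) = \frac{e^{-\beta H + \beta\mu N}}{\mathcal{Q}}, \quad\text{where}\quad \mathcal{Q}(\beta,\mu) = \mathrm{tr}\left(e^{-\beta H + \beta\mu N}\right) = \sum_{N=0}^{\infty} e^{\beta\mu N} Z_N(\beta). \tag{VI.88}$$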




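Example: Consider a single particle in a quantum canonical ensemble in a box of
volume $V$. The energy eigen-states of the Hamiltonian
$$H_1 = \frac{\vec p^{\,2}}{2m} = -\frac{\hbar^2}{2m}\nabla^2 \quad\text{(in coordinate basis)}, \tag{VI.89}$$
obtained from $H_1|\vec k\rangle = \mathcal{E}(\vec k)|\vec k\rangle$, are
$$\langle\vec x|\vec k\rangle = \frac{e^{i\vec k\cdot\vec x}}{\sqrt{V}}, \quad\text{with}\quad \mathcal{E}(\vec k) = \frac{\hbar^2k^2}{2m}. \tag{VI.90}$$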


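With periodic boundary conditions in a box of size $L$, the allowed values of $\vec k$ are
$(2\pi/L)(\ell_x, \ell_y, \ell_z)$, where $(\ell_x, \ell_y, \ell_z)$ are integers. A particular vector in this one-particle
Hilbert space is specified by its components along each of these basis states (infinite in
number). The space of quantum micro-states is thus much larger than the corresponding
6-dimensional classical phase space. The partition function for $L \to \infty$,
$$Z_1 = \mathrm{tr}\left(e^{-\beta H_1}\right) = \sum_{\vec k}\exp\left(-\frac{\beta\hbar^2k^2}{2m}\right) = V\int\frac{d^3\vec k}{(2\pi)^3}\exp\left(-\frac{\beta\hbar^2k^2}{2m}\right) = \frac{V}{(2\pi)^3}\left(\frac{2\pi mk_BT}{\hbar^2}\right)^{3/2} = \frac{V}{\lambda^3}, \tag{VI.91}$$
coincides with the classical result, with $\lambda = h/\sqrt{2\pi mk_BT}$ (justifying the use of $d^3\vec p\,d^3\vec q/h^3$
as the correct dimensionless measure of phase space). Elements of the density matrix in a
coordinate representation are
$$\begin{aligned}
\langle\vec x'|\rho|\vec x\rangle &= \sum_{\vec k}\langle\vec x'|\vec k\rangle\frac{e^{-\beta\mathcal{E}(\vec k)}}{Z_1}\langle\vec k|\vec x\rangle = \frac{\lambda^3}{V}\int\frac{V\,d^3\vec k}{(2\pi)^3}\,\frac{e^{-i\vec k\cdot(\vec x-\vec x')}}{V}\exp\left(-\frac{\beta\hbar^2k^2}{2m}\right) \\
&= \frac{1}{V}\exp\left[-\frac{m(\vec x-\vec x')^2}{2\beta\hbar^2}\right] = \frac{1}{V}\exp\left[-\frac{\pi(\vec x-\vec x')^2}{\lambda^2}\right].
\end{aligned} \tag{VI.92}$$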

The diagonal elements, $\langle\vec x|\rho|\vec x\rangle = 1/V$, are just the probabilities for finding a particle at
$\vec x$. The off-diagonal elements have no classical analog. They suggest that the appropriate
way to think of the particle is as a wave-packet of size $\lambda = h/\sqrt{2\pi mk_BT}$, the thermal
wavelength. As $T \to \infty$, $\lambda$ goes to zero, and the classical analysis is valid. As $T \to 0$,
$\lambda$ diverges and quantum mechanical effects take over when $\lambda$ becomes comparable to the
size of the box.
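
As a rough numerical illustration of eq.(VI.91), the sketch below compares the discrete sum over allowed wavevectors with its continuum limit, in one dimension and in terms of the dimensionless ratio $a = \lambda/L$ (for $k = 2\pi\ell/L$ the Boltzmann weight is $\exp[-\pi(a\ell)^2]$). The function names and the cutoff are illustrative choices, not part of the notes.

```python
import math

def z1_discrete_1d(a, cutoff=None):
    """Sum exp(-pi * (a*l)^2) over integers l, with a = lambda / L."""
    if cutoff is None:
        # Terms beyond |l| ~ 10/a are smaller than exp(-100*pi).
        cutoff = int(10 / a) + 10
    return sum(math.exp(-math.pi * (a * l) ** 2) for l in range(-cutoff, cutoff + 1))

def z1_continuum_1d(a):
    """Continuum limit of the sum: L / lambda = 1 / a."""
    return 1.0 / a

def thermal_wavelength(mass, temperature):
    """lambda = h / sqrt(2 pi m k_B T), all quantities in SI units."""
    h = 6.62607015e-34    # Planck constant (J s)
    k_b = 1.380649e-23    # Boltzmann constant (J/K)
    return h / math.sqrt(2.0 * math.pi * mass * k_b * temperature)

if __name__ == "__main__":
    # The two columns agree when lambda << L and differ once lambda ~ L.
    for a in (0.05, 0.2, 1.0, 3.0):
        print(a, z1_discrete_1d(a), z1_continuum_1d(a))
```

The three-dimensional result follows by cubing the one-dimensional sum, since the Boltzmann weight factorizes over the components of $\vec k$.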
Alternatively, we could have obtained eq.(VI.92) by noting that eq.(VI.86) implies
$$\frac{\partial}{\partial\beta}Z\rho = -HZ\rho = \frac{\hbar^2}{2m}\nabla^2 Z\rho. \tag{VI.93}$$
This is just the diffusion equation (for the free particle), which can be solved subject to
the initial condition $\rho(\beta = 0) = 1$ (i.e. $\langle\vec x'|\rho(\beta = 0)|\vec x\rangle = \delta^3(\vec x - \vec x')/V$) to yield the result
of eq.(VI.92).




VII. Ideal Quantum Gases 


VII.A Hilbert Space of Identical Particles 


In chapter IV, we discussed the Gibbs paradox for the mixing entropy of gases of 
identical particles. This difficulty was overcome by postulating that the phase space for N 
identical particles must be divided by N!, the number of permutations. This is not quite 
satisfactory as the classical equations of motion implicitly treat the particles as distinct. 
In quantum mechanics, by contrast, the identity of particles appears at the level of allowed 
states in Hilbert space. For example, the probability of finding two identical particles 
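at positions $\vec x_1$ and $\vec x_2$ is $|\Psi(\vec x_1,\vec x_2)|^2$. Since the exchange of particles 1 and 2 leads to
the same configuration, we must have $|\Psi(\vec x_1,\vec x_2)|^2 = |\Psi(\vec x_2,\vec x_1)|^2$. For a single-valued
function, this leads to two possibilities:
$$|\Psi(1,2)\rangle = +|\Psi(2,1)\rangle, \quad\text{or}\quad |\Psi(1,2)\rangle = -|\Psi(2,1)\rangle. \tag{VII.1}$$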


The Hilbert space used to describe identical particles is thus restricted to obey certain 
symmetries. 

For a system of $N$ identical particles, there are $N!$ permutations $P$, forming a group
$S_N$. There are several ways for representing a permutation; e.g., $P(1\,2\,3\,4) = (3\,2\,4\,1)$
for $N = 4$ can alternatively be indicated by
$$P = \begin{pmatrix} 1 & 2 & 3 & 4 \\ 3 & 2 & 4 & 1 \end{pmatrix}. \tag{VII.2}$$
Any permutation can be obtained from a sequence of two particle exchanges. For example,
the above permutation is obtained by the exchanges (1,3) and (1,4) performed in sequence.
The parity of a permutation is defined as
$$(-1)^P \equiv \begin{cases}
+1 & \text{if } P \text{ involves an even number of exchanges, e.g. } (1\,2\,3) \to (2\,3\,1), \\
-1 & \text{if } P \text{ involves an odd number of exchanges, e.g. } (1\,2\,3) \to (2\,1\,3).
\end{cases}$$
(Note that if lines are drawn connecting the initial and final locations of each integer in
eq.(VII.2), the parity is $(-1)$ raised to the number of intersections of these lines.)
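
The two characterizations of parity given above (counting exchanges, and counting line intersections) can be compared in a few lines. This is a minimal sketch; the function names are illustrative, and permutations are written as tuples such as `(3, 2, 4, 1)` for the bottom row of eq.(VII.2).

```python
from itertools import permutations

def parity_by_inversions(perm):
    """(-1)^(number of crossings): count pairs that appear out of order."""
    n = len(perm)
    crossings = sum(1 for i in range(n) for j in range(i + 1, n) if perm[i] > perm[j])
    return -1 if crossings % 2 else 1

def parity_by_exchanges(perm):
    """(-1)^(number of two-particle exchanges used to sort the list)."""
    work = list(perm)
    exchanges = 0
    for i in range(len(work)):
        while work[i] != i + 1:
            j = work[i] - 1              # slot where the value work[i] belongs
            work[i], work[j] = work[j], work[i]
            exchanges += 1
    return -1 if exchanges % 2 else 1

if __name__ == "__main__":
    print(parity_by_inversions((3, 2, 4, 1)), parity_by_exchanges((3, 2, 4, 1)))
    # Both definitions agree on every permutation of four objects.
    print(all(parity_by_inversions(p) == parity_by_exchanges(p)
              for p in permutations(range(1, 5))))
```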

The action of permutations on an $N$-particle quantum state leads to a representation
of the permutation group in Hilbert space. Requiring the wave-function to be single valued,
and to give equal probabilities under particle exchange, restricts the representation to be
either fully symmetric or anti-symmetric. This allows for two types of identical particles
in nature:

(1) Bosons correspond to the fully symmetric representation such that
$P|\Psi(1,\cdots,N)\rangle = +|\Psi(1,\cdots,N)\rangle$.
(2) Fermions correspond to the fully anti-symmetric representation such that
$P|\Psi(1,\cdots,N)\rangle = (-1)^P|\Psi(1,\cdots,N)\rangle$.


Of course, the Hamiltonian for identical particles must itself be symmetric, i.e. $PH = H$.
However, for a given $H$, there are many eigen-states with different symmetries under
permutations. To select the correct set of eigen-states, in quantum mechanics the statistics
of the particles (bosons or fermions) is specified independently. For example, consider $N$
non-interacting particles in a box of volume $V$, with a Hamiltonian
$$H = \sum_{\alpha=1}^{N} H_\alpha = \sum_{\alpha=1}^{N}\left(-\frac{\hbar^2}{2m}\nabla_\alpha^2\right). \tag{VII.3}$$


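Each $H_\alpha$ can be separately diagonalized, with plane wave states $\{|\vec k\rangle\}$ and corresponding
energies $\mathcal{E}(\vec k) = \hbar^2k^2/2m$. Using sums and products of these one-particle states, we can
construct the following $N$-particle states:

(1) The product Hilbert space is obtained by simple multiplication of the one-body states,
i.e.
$$|\vec k_1,\cdots,\vec k_N\rangle_\otimes = |\vec k_1\rangle\cdots|\vec k_N\rangle. \tag{VII.4}$$
In the coordinate representation,
$$\langle\vec x_1,\cdots,\vec x_N|\vec k_1,\cdots,\vec k_N\rangle_\otimes = \frac{1}{V^{N/2}}\exp\left(i\sum_{\alpha=1}^{N}\vec k_\alpha\cdot\vec x_\alpha\right), \tag{VII.5}$$
and
$$H|\vec k_1,\cdots,\vec k_N\rangle_\otimes = \left(\sum_{\alpha=1}^{N}\frac{\hbar^2}{2m}k_\alpha^2\right)|\vec k_1,\cdots,\vec k_N\rangle_\otimes. \tag{VII.6}$$
But the product states do not satisfy the symmetry requirements for identical particles,
and we must find the appropriate subspaces of correct symmetry.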




(2) The fermionic subspace is constructed as
$$|\vec k_1,\cdots,\vec k_N\rangle_- = \frac{1}{\sqrt{N_-}}\sum_P(-1)^P P|\vec k_1,\cdots,\vec k_N\rangle_\otimes, \tag{VII.7}$$
where the sum is over all $N!$ permutations. Clearly, if any one-particle label $\vec k_\alpha$ appears
more than once in the above list, the result is zero and there is no anti-symmetrized
state. Anti-symmetrization is possible only when all the $N$ values of $\vec k_\alpha$ are different.
In this case, there are $N!$ terms in the above sum, and $N_- = N!$ is necessary to ensure
normalization. For example, a two-particle anti-symmetrized state is
$$|\vec k_1,\vec k_2\rangle_- = \frac{|\vec k_1\rangle|\vec k_2\rangle - |\vec k_2\rangle|\vec k_1\rangle}{\sqrt 2}.$$
(If not otherwise indicated, $|\vec k_1,\cdots,\vec k_N\rangle$ refers to the product state.)
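(3) Similarly, the bosonic subspace is constructed as
$$|\vec k_1,\cdots,\vec k_N\rangle_+ = \frac{1}{\sqrt{N_+}}\sum_P P|\vec k_1,\cdots,\vec k_N\rangle_\otimes. \tag{VII.8}$$
In this case, there are no restrictions on the allowed values of $\vec k$. A particular one-particle
state may be repeated $n_{\vec k}$ times in the list, with $\sum_{\vec k} n_{\vec k} = N$. As we shall prove
shortly, proper normalization requires $N_+ = N!\prod_{\vec k} n_{\vec k}!$. For example, a correctly
normalized 3-particle bosonic state is constructed from two one-particle states $|\alpha\rangle$,
and one one-particle state $|\beta\rangle$ as ($n_\alpha = 2$, $n_\beta = 1$, and $N_+ = 3!\,2!\,1! = 12$)
$$\begin{aligned}
|\alpha\alpha\beta\rangle_+ &= \frac{1}{\sqrt{12}}\left(|\alpha\rangle|\alpha\rangle|\beta\rangle + |\alpha\rangle|\beta\rangle|\alpha\rangle + |\beta\rangle|\alpha\rangle|\alpha\rangle + |\alpha\rangle|\alpha\rangle|\beta\rangle + |\beta\rangle|\alpha\rangle|\alpha\rangle + |\alpha\rangle|\beta\rangle|\alpha\rangle\right) \\
&= \frac{1}{\sqrt 3}\left(|\alpha\rangle|\alpha\rangle|\beta\rangle + |\alpha\rangle|\beta\rangle|\alpha\rangle + |\beta\rangle|\alpha\rangle|\alpha\rangle\right).
\end{aligned}$$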
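• It is convenient to discuss bosons and fermions simultaneously by defining
$$|\{\vec k\}\rangle_\eta = \frac{1}{\sqrt{N_\eta}}\sum_P\eta^P P|\{\vec k\}\rangle, \quad\text{with}\quad \eta = \begin{cases} +1 & \text{for bosons} \\ -1 & \text{for fermions} \end{cases}. \tag{VII.9}$$
Each state is uniquely specified by a set of occupation numbers $\{n_{\vec k}\}$, such that $\sum_{\vec k} n_{\vec k} = N$,
and
(1) For fermions, $|\{\vec k\}\rangle_- = 0$, unless $n_{\vec k} = 0$ or 1, and $N_- = N!\prod_{\vec k} n_{\vec k}! = N!$.
(2) For bosons, any $\vec k$ may be repeated $n_{\vec k}$ times, and the normalization is calculated from
$$\begin{aligned}
{}_+\langle\{\vec k\}|\{\vec k\}\rangle_+ &= \frac{1}{N_+}\sum_{P,P'}\langle P\{\vec k\}|P'\{\vec k\}\rangle = \frac{N!}{N_+}\sum_P\langle\{\vec k\}|P\{\vec k\}\rangle \\
&= \frac{N!\prod_{\vec k} n_{\vec k}!}{N_+} = 1, \quad\Longrightarrow\quad N_+ = N!\prod_{\vec k} n_{\vec k}!.
\end{aligned} \tag{VII.10}$$
(The term $\langle\{\vec k\}|P\{\vec k\}\rangle$ is zero unless the permuted $\vec k$'s are identical to the original set,
which happens $\prod_{\vec k} n_{\vec k}!$ times.)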
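
The normalization factors can be checked by brute force for small $N$. The sketch below builds the unnormalized sum $\sum_P \eta^P P|\{k\}\rangle$ as a dictionary of amplitudes over product states, assuming (as in the text) that product states with different label orderings are orthonormal. The helper names are illustrative only.

```python
from collections import Counter
from itertools import permutations
from math import factorial

def sign(perm):
    """Parity (-1)^P of a permutation of 0..N-1, from its inversion count."""
    n = len(perm)
    inversions = sum(1 for i in range(n) for j in range(i + 1, n) if perm[i] > perm[j])
    return -1 if inversions % 2 else 1

def norm_squared(labels, eta):
    """Squared norm of sum_P eta^P P|labels>, before dividing by N_eta."""
    amplitude = Counter()
    for perm in permutations(range(len(labels))):
        state = tuple(labels[i] for i in perm)
        amplitude[state] += sign(perm) if eta == -1 else 1
    # Distinct product states are orthonormal, so squared amplitudes simply add.
    return sum(a * a for a in amplitude.values())

def n_plus(labels):
    """N_+ = N! * prod_k n_k! from eq.(VII.10)."""
    result = factorial(len(labels))
    for count in Counter(labels).values():
        result *= factorial(count)
    return result

if __name__ == "__main__":
    print(norm_squared(("a", "a", "b"), +1), n_plus(("a", "a", "b")))  # bosons
    print(norm_squared(("a", "a", "b"), -1))   # fermions with a repeated label
    print(norm_squared(("a", "b", "c"), -1))   # fermions, all labels distinct
```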




VII.B Canonical Formulation 


Using the states constructed in the previous section, we can calculate the canonical 
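density matrix for non-interacting identical particles. In the coordinate representation we
have
$$\langle\{\vec x'\}|\rho|\{\vec x\}\rangle_\eta = {\sum_{\{\vec k\}}}'\sum_{P,P'}\frac{\eta^P\eta^{P'}}{N_\eta}\langle\{\vec x'\}|P'\{\vec k\}\rangle\,\rho(\{\vec k\})\,\langle P\{\vec k\}|\{\vec x\}\rangle, \tag{VII.11}$$
where $\rho(\{\vec k\}) = \exp\left[-\beta\left(\sum_{\alpha=1}^{N}\hbar^2k_\alpha^2/2m\right)\right]/Z_N$. The sum ${\sum_{\{\vec k\}}}'$ is restricted to
ensure that each identical particle state appears once and only once. In both the bosonic
and fermionic subspaces, the set of occupation numbers $\{n_{\vec k}\}$ uniquely identifies a state.
We can, however, remove this restriction from eq.(VII.11), if we divide by the resulting
over-counting factor (for bosons) of $N!/(\prod_{\vec k} n_{\vec k}!)$, i.e.,
$${\sum_{\{\vec k\}}}' = \sum_{\{\vec k\}}\frac{\prod_{\vec k} n_{\vec k}!}{N!}.$$
(Note that for fermions, the $(-1)^P$ factors cancel out the contributions from cases where
any $n_{\vec k}$ is larger than one.) Therefore,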
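$$\begin{aligned}
\langle\{\vec x'\}|\rho|\{\vec x\}\rangle_\eta = \sum_{\{\vec k\}}&\frac{\prod_{\vec k} n_{\vec k}!}{N!}\cdot\frac{1}{N!\prod_{\vec k} n_{\vec k}!}\cdot\frac{1}{Z_N}\exp\left(-\beta\sum_{\alpha=1}^{N}\frac{\hbar^2k_\alpha^2}{2m}\right) \\
&\times\sum_{P,P'}\eta^P\eta^{P'}\langle\{\vec x'\}|P'\{\vec k\}\rangle\langle P\{\vec k\}|\{\vec x\}\rangle.
\end{aligned} \tag{VII.12}$$
In the limit of large volume, the sums over $\{\vec k\}$ can be replaced by integrals, and using the
plane wave representation of wavefunctions, we have
$$\begin{aligned}
\langle\{\vec x'\}|\rho|\{\vec x\}\rangle_\eta = \frac{1}{Z_N(N!)^2}\sum_{P,P'}\eta^P\eta^{P'}&\int\prod_{\alpha=1}^{N}\frac{V\,d^3\vec k_\alpha}{(2\pi)^3}\exp\left(-\sum_{\alpha=1}^{N}\frac{\beta\hbar^2k_\alpha^2}{2m}\right) \\
&\times\frac{\exp\left[-i\sum_{\alpha=1}^{N}\left(\vec k_{P\alpha}\cdot\vec x_\alpha - \vec k_{P'\alpha}\cdot\vec x'_\alpha\right)\right]}{V^N}.
\end{aligned} \tag{VII.13}$$
We can order the sum in the exponent by focusing on a particular $\vec k$-vector. Since
$\sum_\alpha f(P\alpha)g(\alpha) = \sum_\beta f(\beta)g(P^{-1}\beta)$, where $\beta = P\alpha$ and $\alpha = P^{-1}\beta$, we obtain
$$\langle\{\vec x'\}|\rho|\{\vec x\}\rangle_\eta = \frac{1}{Z_N(N!)^2}\sum_{P,P'}\eta^P\eta^{P'}\prod_{\alpha=1}^{N}\left[\int\frac{d^3\vec k_\alpha}{(2\pi)^3}\,e^{-i\vec k_\alpha\cdot\left(\vec x_{P^{-1}\alpha} - \vec x'_{P'^{-1}\alpha}\right) - \beta\hbar^2k_\alpha^2/2m}\right]. \tag{VII.14}$$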




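The gaussian integrals in the square brackets are equal to
$$\frac{1}{\lambda^3}\exp\left[-\frac{\pi}{\lambda^2}\left(\vec x_{P^{-1}\alpha} - \vec x'_{P'^{-1}\alpha}\right)^2\right].$$
Setting $\beta = P^{-1}\alpha$ in eq.(VII.14) gives
$$\langle\{\vec x'\}|\rho|\{\vec x\}\rangle_\eta = \frac{1}{Z_N\lambda^{3N}(N!)^2}\sum_{P,P'}\eta^P\eta^{P'}\exp\left[-\frac{\pi}{\lambda^2}\sum_{\beta=1}^{N}\left(\vec x_\beta - \vec x'_{P'^{-1}P\beta}\right)^2\right]. \tag{VII.15}$$
Finally, we set $Q = P'^{-1}P$, and use the results $\eta^P = \eta^{P^{-1}}$, and $\eta^Q = \eta^{P'^{-1}P} = \eta^{P'}\eta^P$, to
get (after performing $\sum_P = N!$)
$$\langle\{\vec x'\}|\rho|\{\vec x\}\rangle_\eta = \frac{1}{Z_N\lambda^{3N}N!}\sum_Q\eta^Q\exp\left[-\frac{\pi}{\lambda^2}\sum_{\beta=1}^{N}\left(\vec x_\beta - \vec x'_{Q\beta}\right)^2\right]. \tag{VII.16}$$
The canonical partition function, $Z_N$, is obtained from the normalization condition
$$\mathrm{tr}(\rho) = 1, \quad\Longrightarrow\quad \int\prod_{\alpha=1}^{N}d^3\vec x_\alpha\,\langle\{\vec x\}|\rho|\{\vec x\}\rangle_\eta = 1,$$
as
$$Z_N = \frac{1}{N!\,\lambda^{3N}}\int\prod_{\alpha=1}^{N}d^3\vec x_\alpha\sum_Q\eta^Q\exp\left[-\frac{\pi}{\lambda^2}\sum_{\beta=1}^{N}\left(\vec x_\beta - \vec x_{Q\beta}\right)^2\right]. \tag{VII.17}$$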


The quantum partition function thus involves a sum over $N!$ possible permutations. The
classical result $Z_N = (V/\lambda^3)^N/N!$ is obtained from the term corresponding to no particle
exchange, $Q = 1$. The division by $N!$ finally justifies the factor that was (somewhat
artificially) introduced in classical statistical mechanics to deal with the phase space of
identical particles. However, this classical result is only valid at very high temperature
and is modified by the quantum corrections coming from the remaining permutations.
As any permutation involves a product of factors $\exp[-\pi(\vec x_1 - \vec x_2)^2/\lambda^2]$, its contribution
vanishes as $\lambda \to 0$ for $T \to \infty$.

The lowest order correction comes from the simplest permutation which is the exchange
of two particles. The exchange of particles 1 and 2 is accompanied by a factor of
$\eta\exp[-2\pi(\vec x_1 - \vec x_2)^2/\lambda^2]$. As each of the possible $N(N-1)/2$ pairwise exchanges gives the
same contribution to $Z_N$, we get
$$Z_N = \frac{1}{N!\,\lambda^{3N}}\int\prod_{\alpha=1}^{N}d^3\vec x_\alpha\left\{1 + \frac{N(N-1)}{2}\,\eta\exp\left[-\frac{2\pi}{\lambda^2}\left(\vec x_1 - \vec x_2\right)^2\right] + \cdots\right\}. \tag{VII.18}$$




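For any $\alpha > 2$, $\int d^3\vec x_\alpha = V$; in the remaining two integrations we can use the relative,
$\vec r_{12} = \vec x_2 - \vec x_1$, and center of mass coordinates to get
$$\begin{aligned}
Z_N &= \frac{1}{N!\,\lambda^{3N}}V^N\left[1 + \frac{N(N-1)}{2V}\,\eta\int d^3\vec r_{12}\,e^{-2\pi r_{12}^2/\lambda^2} + \cdots\right] \\
&= \frac{1}{N!}\left(\frac{V}{\lambda^3}\right)^N\left[1 + \frac{N(N-1)}{2V}\left(\frac{\lambda^2}{2}\right)^{3/2}\eta + \cdots\right].
\end{aligned} \tag{VII.19}$$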

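From the corresponding free energy,
$$F = -k_BT\ln Z_N = -Nk_BT\ln\left(\frac{e}{\lambda^3}\cdot\frac{V}{N}\right) - \frac{k_BTN^2}{2V}\cdot\frac{\lambda^3}{2^{3/2}}\,\eta + \cdots, \tag{VII.20}$$
the gas pressure is computed as
$$P = -\left.\frac{\partial F}{\partial V}\right|_T = \frac{Nk_BT}{V} - \frac{N^2k_BT}{V^2}\cdot\frac{\lambda^3}{2^{5/2}}\,\eta + \cdots = nk_BT\left[1 - \frac{\eta\lambda^3}{2^{5/2}}\,n + \cdots\right]. \tag{VII.21}$$
Note that the first quantum correction is equivalent to a second virial coefficient of
$$B_2 = -\frac{\eta\lambda^3}{2^{5/2}}. \tag{VII.22}$$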


The resulting correction to pressure is negative for bosons, and positive for fermions. In the
classical formulation, a second virial coefficient was obtained from a two-body interaction.
The classical potential $\mathcal{V}(\vec r)$ that leads to the second virial coefficient in eq.(VII.22) is
obtained from
$$\begin{aligned}
f(\vec r) &= e^{-\beta\mathcal{V}(\vec r)} - 1 = \eta\,e^{-2\pi r^2/\lambda^2}, \quad\Longrightarrow \\
\mathcal{V}(\vec r) &= -k_BT\ln\left[1 + \eta\,e^{-2\pi r^2/\lambda^2}\right] \approx -k_BT\,\eta\,e^{-2\pi r^2/\lambda^2}.
\end{aligned} \tag{VII.23}$$
(The final approximation corresponds to high temperatures, where only the first correction
is important). Thus the effects of quantum statistics at high temperatures are approximately
equivalent to introducing an interaction between particles. The interaction is
attractive for bosons, repulsive for fermions, and operates over distances of the order of
the thermal wavelength $\lambda$.
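
The link between the exchange factor and the second virial coefficient can be checked numerically. The sketch below, with illustrative function names and lengths measured in units of $\lambda$, integrates $B_2 = -\frac{1}{2}\int d^3\vec r\, f(\vec r)$ by a simple midpoint rule and compares it with eq.(VII.22). It also evaluates the effective potential of eq.(VII.23) to confirm its sign for bosons and fermions.

```python
import math

def exchange_f(r, eta, lam=1.0):
    """f(r) = eta * exp(-2 pi r^2 / lambda^2), as in eq.(VII.23)."""
    return eta * math.exp(-2.0 * math.pi * r * r / lam ** 2)

def b2_numeric(eta, lam=1.0, n_steps=20000):
    """B_2 = -(1/2) * integral of 4 pi r^2 f(r) dr, by the midpoint rule."""
    r_max = 6.0 * lam                    # the integrand is negligible beyond this
    dr = r_max / n_steps
    total = 0.0
    for i in range(n_steps):
        r = (i + 0.5) * dr
        total += 4.0 * math.pi * r * r * exchange_f(r, eta, lam) * dr
    return -0.5 * total

def b2_closed_form(eta, lam=1.0):
    """B_2 = -eta lambda^3 / 2^(5/2), eq.(VII.22)."""
    return -eta * lam ** 3 / 2 ** 2.5

def effective_potential_over_kT(r, eta, lam=1.0):
    """V(r)/(k_B T) = -ln(1 + f(r)); for fermions this diverges as r -> 0."""
    return -math.log(1.0 + exchange_f(r, eta, lam))

if __name__ == "__main__":
    for eta in (+1, -1):
        print(eta, b2_numeric(eta), b2_closed_form(eta),
              effective_potential_over_kT(0.3, eta))
```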




VII.C Grand Canonical Formulation 


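Calculating the partition function by performing all the sums in eq.(VII.17) is a
formidable task. Alternatively, we can compute $Z_N$ in the energy basis as
$$Z_N = \mathrm{tr}\left(e^{-\beta H}\right) = {\sum_{\{\vec k_\alpha\}}}'\exp\left[-\beta\sum_{\alpha=1}^{N}\mathcal{E}(\vec k_\alpha)\right] = {\sum_{\{n_{\vec k}\}}}'\exp\left[-\beta\sum_{\vec k}\mathcal{E}(\vec k)\,n(\vec k)\right]. \tag{VII.24}$$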
{Ra} ost NE k 


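These sums are still difficult to perform due to the restrictions of symmetry on the allowed
values of $\vec k$ or $\{n_{\vec k}\}$: The occupation numbers $\{n_{\vec k}\}$ are restricted to $\sum_{\vec k} n_{\vec k} = N$, and
$n_{\vec k} = 0, 1, 2, \cdots$ for bosons, while $n_{\vec k} = 0$ or 1 for fermions. As usual, the first constraint
can be removed by looking at the grand partition function,
$$\begin{aligned}
\mathcal{Q}_\eta(T,\mu) &= \sum_{N=0}^{\infty}e^{\beta\mu N}{\sum_{\{n_{\vec k}\}}}'\exp\left[-\beta\sum_{\vec k}\mathcal{E}(\vec k)\,n_{\vec k}\right] \\
&= {\sum_{\{n_{\vec k}\}}}^{\eta}\prod_{\vec k}\exp\left[-\beta\left(\mathcal{E}(\vec k) - \mu\right)n_{\vec k}\right].
\end{aligned} \tag{VII.25}$$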
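The sums over $\{n_{\vec k}\}$ can now be performed independently for each $\vec k$, subject to the
restrictions on occupation numbers imposed by particle symmetry.

• For fermions, $n_{\vec k} = 0$ or 1, and
$$\mathcal{Q}_- = \prod_{\vec k}\left[1 + \exp\left(\beta\mu - \beta\mathcal{E}(\vec k)\right)\right]. \tag{VII.26}$$
• For bosons, $n_{\vec k} = 0, 1, 2, \cdots$, and summing the geometric series gives
$$\mathcal{Q}_+ = \prod_{\vec k}\left[1 - \exp\left(\beta\mu - \beta\mathcal{E}(\vec k)\right)\right]^{-1}. \tag{VII.27}$$
The results for both cases can be presented simultaneously as
$$\ln\mathcal{Q}_\eta = -\eta\sum_{\vec k}\ln\left[1 - \eta\exp\left(\beta\mu - \beta\mathcal{E}(\vec k)\right)\right], \tag{VII.28}$$
with $\eta = -1$ for fermions, and $\eta = +1$ for bosons.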
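
For a handful of one-particle levels, eq.(VII.28) can be compared directly with a brute-force sum over occupation numbers as in eq.(VII.25). This is a minimal sketch: the level energies, $\beta$ and $\mu$ below are arbitrary illustrative values, and the bosonic sum is truncated at a hypothetical cutoff `n_max`, which is harmless as long as $\mu$ lies well below the lowest level.

```python
import math
from itertools import product

def ln_q_formula(levels, beta, mu, eta):
    """ln Q_eta = -eta * sum_k ln[1 - eta * exp(beta*mu - beta*E_k)], eq.(VII.28)."""
    return -eta * sum(math.log(1.0 - eta * math.exp(beta * (mu - e))) for e in levels)

def ln_q_brute_force(levels, beta, mu, eta, n_max=60):
    """Sum the weights exp[-beta * sum_k (E_k - mu) n_k] over all allowed {n_k}."""
    occupations = (0, 1) if eta == -1 else range(n_max + 1)
    total = 0.0
    for ns in product(occupations, repeat=len(levels)):
        total += math.exp(-beta * sum((e - mu) * n for e, n in zip(levels, ns)))
    return math.log(total)

if __name__ == "__main__":
    levels = (0.0, 0.7, 1.5)     # illustrative one-particle energies
    beta, mu = 1.0, -0.5         # mu below the lowest level, as bosons require
    for eta in (-1, +1):
        print(eta, ln_q_formula(levels, beta, mu, eta),
              ln_q_brute_force(levels, beta, mu, eta))
```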
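In the grand canonical formulation, different one-particle states are occupied independently,
with a joint probability
$$p_\eta\left(\{n(\vec k)\}\right) = \frac{1}{\mathcal{Q}_\eta}\prod_{\vec k}\exp\left[-\beta\left(\mathcal{E}(\vec k) - \mu\right)n_{\vec k}\right]. \tag{VII.29}$$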




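The average occupation number of a state of energy $\mathcal{E}(\vec k)$ is given by
$$\langle n_{\vec k}\rangle_\eta = -\frac{\partial\ln\mathcal{Q}_\eta}{\partial\left(\beta\mathcal{E}(\vec k)\right)} = \frac{1}{z^{-1}e^{\beta\mathcal{E}(\vec k)} - \eta}, \tag{VII.30}$$
where $z = \exp(\beta\mu)$. The average values of the particle number and internal energy are
then given by
$$\begin{aligned}
N_\eta &= \sum_{\vec k}\langle n_{\vec k}\rangle_\eta = \sum_{\vec k}\frac{1}{z^{-1}e^{\beta\mathcal{E}(\vec k)} - \eta}, \\
E_\eta &= \sum_{\vec k}\mathcal{E}(\vec k)\langle n_{\vec k}\rangle_\eta = \sum_{\vec k}\frac{\mathcal{E}(\vec k)}{z^{-1}e^{\beta\mathcal{E}(\vec k)} - \eta}.
\end{aligned} \tag{VII.31}$$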


VII.D Non-relativistic Gas 


Quantum particles are further characterized by a spin s. In the absence of a magnetic 
field different spin states have the same energy, and a spin degeneracy factor, g = 2s + 1, 
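multiplies eqs.(VII.28)-(VII.31). In particular, for a non-relativistic gas in three dimensions
($\mathcal{E}(\vec k) = \hbar^2k^2/2m$, and $\sum_{\vec k} \to V\int d^3\vec k/(2\pi)^3$) these equations reduce to
$$\begin{aligned}
\beta P_\eta &= \frac{\ln\mathcal{Q}_\eta}{V} = -\eta g\int\frac{d^3\vec k}{(2\pi)^3}\ln\left[1 - \eta z\exp\left(-\frac{\beta\hbar^2k^2}{2m}\right)\right], \\
n_\eta &= \frac{N_\eta}{V} = g\int\frac{d^3\vec k}{(2\pi)^3}\frac{1}{z^{-1}\exp\left(\frac{\beta\hbar^2k^2}{2m}\right) - \eta}, \\
\varepsilon_\eta &= \frac{E_\eta}{V} = g\int\frac{d^3\vec k}{(2\pi)^3}\frac{\hbar^2k^2/2m}{z^{-1}\exp\left(\frac{\beta\hbar^2k^2}{2m}\right) - \eta}.
\end{aligned} \tag{VII.32}$$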

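To simplify these equations, we change variables to $x = \beta\hbar^2k^2/(2m)$, so that
$$k = \frac{\sqrt{2mk_BT}}{\hbar}\,x^{1/2} = \frac{2\pi^{1/2}}{\lambda}\,x^{1/2}, \quad\Longrightarrow\quad dk = \frac{\pi^{1/2}}{\lambda}\,x^{-1/2}\,dx.$$
Substituting into eqs.(VII.32) gives
$$\begin{aligned}
\beta P_\eta &= -\eta\,\frac{g}{2\pi^2}\,\frac{4\pi^{3/2}}{\lambda^3}\int_0^\infty dx\,x^{1/2}\ln\left(1 - \eta ze^{-x}\right) = \frac{g}{\lambda^3}\,\frac{4}{3\sqrt\pi}\int_0^\infty\frac{dx\,x^{3/2}}{z^{-1}e^x - \eta} \quad\text{(integration by parts)}, \\
n_\eta &= \frac{g}{\lambda^3}\,\frac{2}{\sqrt\pi}\int_0^\infty\frac{dx\,x^{1/2}}{z^{-1}e^x - \eta}, \\
\beta\varepsilon_\eta &= \frac{g}{\lambda^3}\,\frac{2}{\sqrt\pi}\int_0^\infty\frac{dx\,x^{3/2}}{z^{-1}e^x - \eta}.
\end{aligned} \tag{VII.33}$$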


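We now define two sets of functions by
$$f_m^\eta(z) = \frac{1}{(m-1)!}\int_0^\infty\frac{dx\,x^{m-1}}{z^{-1}e^x - \eta}. \tag{VII.34}$$
For non-integer arguments, the function $m! \equiv \Gamma(m+1)$ is defined by the integral
$\int_0^\infty dx\,x^me^{-x}$. In particular, from this definition it follows that $(1/2)! = \sqrt\pi/2$, and
$(3/2)! = (3/2)\sqrt\pi/2$. Eqs.(VII.33) now take the simple form
$$\begin{aligned}
\beta P_\eta &= \frac{g}{\lambda^3}\,f_{5/2}^\eta(z), \\
n_\eta &= \frac{g}{\lambda^3}\,f_{3/2}^\eta(z), \\
\varepsilon_\eta &= \frac{3}{2}P_\eta.
\end{aligned} \tag{VII.35}$$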


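These results completely describe the thermodynamics of ideal quantum gases as a function
of $z$. To find the equation of state $P_\eta(n_\eta, T)$, we need to solve for $z$ in terms of density.
This requires knowledge of the behavior of the functions $f_m^\eta(z)$.

The high temperature, low density (non-degenerate) limit will be examined first. In
this limit, $z$ is small, and
$$\begin{aligned}
f_m^\eta(z) &= \frac{1}{(m-1)!}\int_0^\infty\frac{dx\,x^{m-1}}{z^{-1}e^x - \eta} = \frac{1}{(m-1)!}\int_0^\infty dx\,x^{m-1}\left(ze^{-x}\right)\left(1 - \eta ze^{-x}\right)^{-1} \\
&= \frac{1}{(m-1)!}\int_0^\infty dx\,x^{m-1}\sum_{\alpha=1}^{\infty}\eta^{\alpha+1}\left(ze^{-x}\right)^\alpha \\
&= \sum_{\alpha=1}^{\infty}\eta^{\alpha+1}z^\alpha\,\frac{1}{(m-1)!}\int_0^\infty dx\,x^{m-1}e^{-\alpha x} \\
&= \sum_{\alpha=1}^{\infty}\eta^{\alpha+1}\frac{z^\alpha}{\alpha^m} = z + \eta\frac{z^2}{2^m} + \frac{z^3}{3^m} + \eta\frac{z^4}{4^m} + \cdots.
\end{aligned} \tag{VII.36}$$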
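
The series of eq.(VII.36) is easy to compare against the defining integral of eq.(VII.34). The following is a minimal numerical sketch using only the standard library; the function names, the midpoint quadrature, and the cutoffs `x_max`, `n_steps`, `n_terms` are illustrative assumptions rather than anything prescribed by the notes.

```python
import math

def f_series(m, z, eta, n_terms=400):
    """f_m^eta(z) = sum_{a>=1} eta^(a+1) z^a / a^m, valid for 0 <= z < 1."""
    return sum(eta ** (a + 1) * z ** a / a ** m for a in range(1, n_terms + 1))

def f_integral(m, z, eta, x_max=60.0, n_steps=100000):
    """f_m^eta(z) from eq.(VII.34), using the midpoint rule and (m-1)! = Gamma(m)."""
    dx = x_max / n_steps
    total = 0.0
    for i in range(n_steps):
        x = (i + 0.5) * dx
        total += x ** (m - 1) / (math.exp(x) / z - eta) * dx
    return total / math.gamma(m)

if __name__ == "__main__":
    for eta in (+1, -1):
        print(eta, f_series(2.5, 0.5, eta), f_integral(2.5, 0.5, eta))
```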
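We thus find (self-consistently) that $f_m^\eta(z)$, and hence $n_\eta(z)$ and $P_\eta(z)$, are indeed small
as $z \to 0$. Eqs.(VII.35) in this limit give,
$$\begin{aligned}
\frac{n_\eta\lambda^3}{g} &= f_{3/2}^\eta(z) = z + \eta\frac{z^2}{2^{3/2}} + \frac{z^3}{3^{3/2}} + \eta\frac{z^4}{4^{3/2}} + \cdots, \\
\frac{\beta P_\eta\lambda^3}{g} &= f_{5/2}^\eta(z) = z + \eta\frac{z^2}{2^{5/2}} + \frac{z^3}{3^{5/2}} + \eta\frac{z^4}{4^{5/2}} + \cdots.
\end{aligned} \tag{VII.37}$$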

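The first of the above equations can be solved perturbatively, by the recursive procedure
of substituting the solution up to a lower order, as
$$\begin{aligned}
z &= \frac{n_\eta\lambda^3}{g} - \eta\frac{z^2}{2^{3/2}} - \frac{z^3}{3^{3/2}} - \cdots \\
&= \left(\frac{n_\eta\lambda^3}{g}\right) - \frac{\eta}{2^{3/2}}\left(\frac{n_\eta\lambda^3}{g}\right)^2 + \left(\frac{1}{4} - \frac{1}{3^{3/2}}\right)\left(\frac{n_\eta\lambda^3}{g}\right)^3 - \cdots.
\end{aligned} \tag{VII.38}$$
Substituting this solution into the second leads to
$$\begin{aligned}
\frac{\beta P_\eta\lambda^3}{g} = &\left(\frac{n_\eta\lambda^3}{g}\right) - \frac{\eta}{2^{3/2}}\left(\frac{n_\eta\lambda^3}{g}\right)^2 + \left(\frac{1}{4} - \frac{1}{3^{3/2}}\right)\left(\frac{n_\eta\lambda^3}{g}\right)^3 \\
&+ \frac{\eta}{2^{5/2}}\left(\frac{n_\eta\lambda^3}{g}\right)^2 - \frac{1}{8}\left(\frac{n_\eta\lambda^3}{g}\right)^3 + \frac{1}{3^{5/2}}\left(\frac{n_\eta\lambda^3}{g}\right)^3 + \cdots.
\end{aligned}$$
The pressure of the quantum gas can thus be obtained from the virial expansion,
$$P_\eta = n_\eta k_BT\left[1 - \frac{\eta}{2^{5/2}}\left(\frac{n_\eta\lambda^3}{g}\right) + \left(\frac{1}{8} - \frac{2}{3^{5/2}}\right)\left(\frac{n_\eta\lambda^3}{g}\right)^2 + \cdots\right]. \tag{VII.39}$$
The second virial coefficient $B_2 = -\eta\lambda^3/(2^{5/2}g)$ agrees with eq.(VII.22) computed in
the canonical ensemble for $g = 1$. The natural (dimensionless) expansion parameter is
$n_\eta\lambda^3/g$, and quantum mechanical effects become important when $n_\eta\lambda^3 \geq g$; the quantum
degenerate limit. The behavior of fermi and bose gases is very different in this degenerate
limit of low temperatures and high densities, and the two cases will be discussed separately
in the following sections.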
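
The perturbative inversion can be cross-checked by solving $n_\eta\lambda^3/g = f_{3/2}^\eta(z)$ numerically and comparing the resulting pressure with the truncated virial series of eq.(VII.39). The sketch below uses bisection on the power series of eq.(VII.36); the names, the bracketing interval and the tolerance are illustrative choices, and the approach is only meant for small values of $y = n_\eta\lambda^3/g$.

```python
def f_series(m, z, eta, n_terms=200):
    """f_m^eta(z) = sum_{a>=1} eta^(a+1) z^a / a^m, eq.(VII.36)."""
    return sum(eta ** (a + 1) * z ** a / a ** m for a in range(1, n_terms + 1))

def solve_fugacity(y, eta, tol=1e-14):
    """Find z such that f_{3/2}^eta(z) = y by bisection (intended for small y)."""
    lo, hi = 0.0, 0.99
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if f_series(1.5, mid, eta) < y:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def pressure_ratio_exact(y, eta):
    """beta P / n = f_{5/2}(z) / f_{3/2}(z), from eqs.(VII.35)."""
    z = solve_fugacity(y, eta)
    return f_series(2.5, z, eta) / f_series(1.5, z, eta)

def pressure_ratio_virial(y, eta):
    """Truncated virial series of eq.(VII.39)."""
    return 1.0 - eta * y / 2 ** 2.5 + (1.0 / 8.0 - 2.0 / 3 ** 2.5) * y ** 2

if __name__ == "__main__":
    for eta in (+1, -1):
        print(eta, pressure_ratio_exact(0.1, eta), pressure_ratio_virial(0.1, eta))
```

The pressure is reduced for bosons and enhanced for fermions relative to the classical value, consistent with the sign of $B_2$.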




VII.E The Degenerate Fermi Gas 
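At zero temperature, the fermi occupation number,
$$\langle n_{\vec k}\rangle_- = \frac{1}{e^{\beta\left(\mathcal{E}(\vec k) - \mu\right)} + 1}, \tag{VII.40}$$
is one for $\mathcal{E}(\vec k) < \mu$, and zero otherwise. The limiting value of $\mu$ at zero temperature is
called the fermi energy, $\mathcal{E}_F$, and all one-particle states of energy less than $\mathcal{E}_F$ are occupied,
forming a fermi sea. For the ideal gas with $\mathcal{E}(\vec k) = \hbar^2k^2/(2m)$, there is a corresponding
fermi wavenumber $k_F$, calculated from
$$N = \sum_{|\vec k| \leq k_F}(2s+1) = gV\int^{k<k_F}\frac{d^3\vec k}{(2\pi)^3} = g\,\frac{V}{6\pi^2}\,k_F^3. \tag{VII.41}$$
In terms of the density $n = N/V$,
$$k_F = \left(\frac{6\pi^2n}{g}\right)^{1/3}, \quad\Longrightarrow\quad \mathcal{E}_F(n) = \frac{\hbar^2k_F^2}{2m} = \frac{\hbar^2}{2m}\left(\frac{6\pi^2n}{g}\right)^{2/3}. \tag{VII.42}$$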


Note that while in a classical treatment the ideal gas has a large density of states at
$T = 0$ (from $\Omega_{\rm classical} = V^N/N!$), the quantum fermi gas has a unique ground state with
$\Omega = 1$. Once the one-particle momenta are specified (all $\vec k$ for $|\vec k| < k_F$), there is only one
anti-symmetrized state, as constructed in eq.(VII.7).
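
To get a feeling for the magnitudes in eq.(VII.42), the sketch below evaluates $k_F$, $\mathcal{E}_F$ and $\mathcal{E}_F/k_B$ in SI units. The example numbers (a conduction-electron density of about $8.5\times10^{28}\,{\rm m}^{-3}$, roughly that of copper, with $g = 2$) are illustrative assumptions and do not come from the notes.

```python
import math

HBAR = 1.054571817e-34     # reduced Planck constant (J s)
K_B = 1.380649e-23         # Boltzmann constant (J/K)
EV = 1.602176634e-19       # one electron-volt in joules
M_ELECTRON = 9.1093837e-31 # electron mass (kg)

def fermi_wavenumber(n, g=2):
    """k_F = (6 pi^2 n / g)^(1/3), eq.(VII.42)."""
    return (6.0 * math.pi ** 2 * n / g) ** (1.0 / 3.0)

def fermi_energy(n, mass, g=2):
    """E_F = hbar^2 k_F^2 / (2 m), eq.(VII.42)."""
    return (HBAR * fermi_wavenumber(n, g)) ** 2 / (2.0 * mass)

if __name__ == "__main__":
    n_assumed = 8.5e28     # assumed electron density (m^-3), roughly copper
    e_f = fermi_energy(n_assumed, M_ELECTRON)
    print("k_F (1/m):", fermi_wavenumber(n_assumed))
    print("E_F (eV):", e_f / EV)
    print("E_F / k_B (K):", e_f / K_B)
```

With these assumed inputs the characteristic temperature $\mathcal{E}_F/k_B$ comes out far above room temperature, which is why conduction electrons in ordinary metals are treated as a degenerate fermi gas.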

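To see how the fermi sea is modified at small temperatures, we need the behavior of
$f_m^-(z)$ for large $z$ which, after integration by parts, is
$$f_m^-(z) = \frac{1}{m!}\int_0^\infty dx\,x^m\,\frac{d}{dx}\left(\frac{-1}{z^{-1}e^x + 1}\right).$$
Since the fermi occupation number changes abruptly from one to zero, its derivative in the
above equation is sharply peaked. We can expand around this peak by setting $x = \ln z + t$,
and extending the range of integration to $-\infty < t < +\infty$, as
$$\begin{aligned}
f_m^-(z) &\approx \frac{1}{m!}\int_{-\infty}^{\infty}dt\,(\ln z + t)^m\,\frac{d}{dt}\left(\frac{-1}{e^t + 1}\right) \\
&= \frac{1}{m!}\int_{-\infty}^{\infty}dt\sum_{\alpha=0}^{\infty}\binom{m}{\alpha}t^\alpha(\ln z)^{m-\alpha}\,\frac{d}{dt}\left(\frac{-1}{e^t + 1}\right) \\
&= \frac{(\ln z)^m}{m!}\sum_{\alpha=0}^{\infty}\frac{m!}{\alpha!\,(m-\alpha)!}(\ln z)^{-\alpha}\int_{-\infty}^{\infty}dt\,t^\alpha\,\frac{d}{dt}\left(\frac{-1}{e^t + 1}\right).
\end{aligned} \tag{VII.43}$$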



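Using the (anti-)symmetry of the integrand under $t \to -t$, and un-doing the integration
by parts yields,
$$\frac{1}{\alpha!}\int_{-\infty}^{\infty}dt\,t^\alpha\,\frac{d}{dt}\left(\frac{-1}{e^t + 1}\right) = \begin{cases}
0 & \text{for } \alpha \text{ odd}, \\[1ex]
\dfrac{2}{(\alpha-1)!}\displaystyle\int_0^\infty dt\,\frac{t^{\alpha-1}}{e^t + 1} = 2f_\alpha^-(1) & \text{for } \alpha \text{ even}.
\end{cases}$$
Inserting the above into eq.(VII.43), and using tabulated values for the integrals $f_\alpha^-(1)$,
leads to the Sommerfeld expansion,
$$\begin{aligned}
\lim_{z\to\infty}f_m^-(z) &= \frac{(\ln z)^m}{m!}\sum_{\alpha=0}^{\rm even}2f_\alpha^-(1)\,\frac{m!}{(m-\alpha)!}(\ln z)^{-\alpha} \\
&= \frac{(\ln z)^m}{m!}\left[1 + \frac{\pi^2}{6}\frac{m(m-1)}{(\ln z)^2} + \frac{7\pi^4}{360}\frac{m(m-1)(m-2)(m-3)}{(\ln z)^4} + \cdots\right].
\end{aligned} \tag{VII.44}$$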
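
The quality of the Sommerfeld expansion can be gauged by comparing eq.(VII.44), truncated after the $(\ln z)^{-4}$ term, with a direct numerical evaluation of $f_m^-(z)$. The sketch below is one possible check; the substitution $x = u^2$, the midpoint rule and the cutoffs are implementation choices made here for illustration.

```python
import math

def f_minus_numeric(m, ln_z, n_steps=40000):
    """f_m^-(z) = (1/Gamma(m)) * integral of x^(m-1) / (e^(x - ln z) + 1) dx.

    The substitution x = u^2 makes the integrand smooth at the origin.
    """
    u_max = math.sqrt(ln_z + 40.0)       # the fermi factor is negligible beyond this
    du = u_max / n_steps
    total = 0.0
    for i in range(n_steps):
        u = (i + 0.5) * du
        x = u * u
        total += 2.0 * u ** (2 * m - 1) / (math.exp(x - ln_z) + 1.0) * du
    return total / math.gamma(m)

def f_minus_sommerfeld(m, ln_z):
    """Eq.(VII.44) truncated after the (ln z)^-4 term; m! = Gamma(m + 1)."""
    lead = ln_z ** m / math.gamma(m + 1)
    c2 = math.pi ** 2 / 6.0 * m * (m - 1)
    c4 = 7.0 * math.pi ** 4 / 360.0 * m * (m - 1) * (m - 2) * (m - 3)
    return lead * (1.0 + c2 / ln_z ** 2 + c4 / ln_z ** 4)

if __name__ == "__main__":
    for m in (1.5, 2.5):
        print(m, f_minus_numeric(m, 20.0), f_minus_sommerfeld(m, 20.0))
```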
In the degenerate limit, the density and chemical potential are related by
$$\frac{n\lambda^3}{g} = f_{3/2}^-(z) = \frac{(\ln z)^{3/2}}{(3/2)!}\left[1 + \frac{\pi^2}{6}\,\frac{3}{2}\,\frac{1}{2}\,(\ln z)^{-2} + \cdots\right] \gg 1. \tag{VII.45}$$
The lowest order result reproduces the expression in eq.(VII.41) for the fermi energy,
$$\lim_{T\to0}\ln z = \left(\frac{3\sqrt\pi}{4}\,\frac{n\lambda^3}{g}\right)^{2/3} = \frac{\beta\hbar^2}{2m}\left(\frac{6\pi^2n}{g}\right)^{2/3} = \beta\mathcal{E}_F.$$
Inserting the zero temperature limit into eq.(VII.45) gives the first order correction,
$$\ln z = \beta\mathcal{E}_F\left[1 + \frac{\pi^2}{8}\left(\frac{k_BT}{\mathcal{E}_F}\right)^2 + \cdots\right]^{-2/3} = \beta\mathcal{E}_F\left[1 - \frac{\pi^2}{12}\left(\frac{k_BT}{\mathcal{E}_F}\right)^2 + \cdots\right]. \tag{VII.46}$$
The appropriate dimensionless expansion parameter is $(k_BT/\mathcal{E}_F)$. Note that the fermion
chemical potential $\mu = k_BT\ln z$ is positive at low temperatures, and negative at high
temperatures (from eq.(VII.38)). It changes sign at a temperature proportional to $\mathcal{E}_F/k_B$.


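The low temperature expansion for the pressure is
$$\begin{aligned}
\beta P &= \frac{g}{\lambda^3}f_{5/2}^-(z) = \frac{g}{\lambda^3}\,\frac{(\ln z)^{5/2}}{(5/2)!}\left[1 + \frac{\pi^2}{6}\,\frac{5}{2}\,\frac{3}{2}\,(\ln z)^{-2} + \cdots\right] \\
&= \frac{g}{\lambda^3}\,\frac{8\left(\beta\mathcal{E}_F\right)^{5/2}}{15\sqrt\pi}\left[1 - \frac{5}{2}\,\frac{\pi^2}{12}\left(\frac{k_BT}{\mathcal{E}_F}\right)^2 + \cdots\right]\left[1 + \frac{5\pi^2}{8}\left(\frac{k_BT}{\mathcal{E}_F}\right)^2 + \cdots\right] \\
&= \beta P_F\left[1 + \frac{5}{12}\pi^2\left(\frac{k_BT}{\mathcal{E}_F}\right)^2 + \cdots\right],
\end{aligned} \tag{VII.47}$$
where $P_F = (2/5)n\mathcal{E}_F$ is the fermi pressure. Unlike its classical counterpart, the fermi gas
at zero temperature has finite pressure and internal energy.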
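The low temperature expansion for the internal energy is obtained easily from
eq.(VII.47) using
$$\frac{E}{V} = \frac{3}{2}P = \frac{3}{5}nk_BT_F\left[1 + \frac{5}{12}\pi^2\left(\frac{T}{T_F}\right)^2 + \cdots\right], \tag{VII.48}$$
where we have introduced the fermi temperature $T_F = \mathcal{E}_F/k_B$. Eq.(VII.48) leads to a low
temperature heat capacity,
$$C_V = \frac{dE}{dT} = \frac{\pi^2}{2}Nk_B\frac{T}{T_F} + \mathcal{O}\left(\frac{T}{T_F}\right)^3. \tag{VII.49}$$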


The linear vanishing of the heat capacity as $T \to 0$ is a general feature of a fermi gas, valid
in all dimensions. It has the following simple physical interpretation: The probability
of occupying single-particle states, eq.(VII.40), is very close to a step function at small
temperatures. Only particles within a distance of approximately $k_BT$ of the fermi energy
can be thermally excited. This represents only a small fraction $T/T_F$ of the total number
of fermions. Each excited particle gains an energy of the order of $k_BT$, leading to a
change in the internal energy of approximately $k_BTN(T/T_F)$. Hence the heat capacity
is given by $C_V = dE/dT \sim Nk_BT/T_F$. This conclusion is also valid for an interacting
fermi gas. The fact that only a small number, $N(T/T_F)$, of fermions are excited at small
temperatures accounts for many interesting properties of fermi gases. For example, the
magnetic susceptibility of a classical gas of $N$ non-interacting particles of magnetic moment
$\mu_B$ follows the Curie law, $\chi \propto N\mu_B^2/(k_BT)$. Since quantum mechanically, only a fraction
of spins contributes at low temperatures, the low temperature susceptibility saturates to a
(Pauli) value of $\chi \propto N\mu_B^2/(k_BT_F)$. (See review problems for the details of this calculation.)


VII.F The Degenerate Bose Gas 


The average boson occupation number,
$$\langle n_{\vec k}\rangle_+ = \frac{1}{\exp\left[\beta\left(\mathcal{E}(\vec k) - \mu\right)\right] - 1}, \tag{VII.50}$$
must be always positive. This requires $\mathcal{E}(\vec k) - \mu$ to be positive for all $\vec k$, and hence
$\mu < \min\left[\mathcal{E}(\vec k)\right]_{\vec k} = 0$ (for $\mathcal{E}(\vec k) = \hbar^2k^2/2m$). At high temperatures (classical limit), $\mu$ is large
and negative, and increases towards zero as $k_BT\ln(n\lambda^3/g)$ as temperature is reduced.
In the degenerate quantum limit, $\mu$ approaches its limiting value of zero. To see how this
limit is achieved, and to find out about the behavior of the degenerate bose gas, we have
to examine the limiting behavior of the functions $f_m^+(z)$ in eqs.(VII.35) as $z = \exp(\beta\mu)$
goes to unity.

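The functions $f_m^+(z)$ are monotonically increasing with $z$ in the interval $0 \leq z \leq 1$.
The maximum value attained at $z = 1$ is
$$\zeta_m \equiv f_m^+(1) = \frac{1}{(m-1)!}\int_0^\infty\frac{dx\,x^{m-1}}{e^x - 1}. \tag{VII.51}$$
The integrand has a pole as $x \to 0$, where it behaves as $\int dx\,x^{m-2}$. Therefore, $\zeta_m$ is finite
for $m > 1$ and infinite for $m \leq 1$. A useful recursive property of these functions is (for
$m > 1$)
$$\begin{aligned}
\frac{d}{dz}f_m^+(z) &= \int_0^\infty dx\,\frac{x^{m-1}}{(m-1)!}\,\frac{d}{dz}\left(\frac{1}{z^{-1}e^x - 1}\right) \\
&= -\frac{1}{z}\int_0^\infty dx\,\frac{x^{m-1}}{(m-1)!}\,\frac{d}{dx}\left(\frac{1}{z^{-1}e^x - 1}\right) \quad\text{(integrate by parts)} \\
&= \frac{1}{z}\int_0^\infty dx\,\frac{x^{m-2}}{(m-2)!}\,\frac{1}{z^{-1}e^x - 1} = \frac{1}{z}f_{m-1}^+(z).
\end{aligned} \tag{VII.52}$$
Hence, a sufficiently high derivative of $f_m^+(z)$ will be divergent at $z = 1$ for all $m$.
The density of excited states for the non-relativistic bose gas in three dimensions is
thus bounded by
$$n_\times = \frac{g}{\lambda^3}f_{3/2}^+(z) \leq n^* = \frac{g}{\lambda^3}\zeta_{3/2}. \tag{VII.53}$$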



At sufficiently high temperatures, such that
$$\frac{n\lambda^3}{g} = \frac{n}{g}\left(\frac{h}{\sqrt{2\pi mk_BT}}\right)^3 \leq \zeta_{3/2} \approx 2.612\cdots, \tag{VII.54}$$
this bound is not relevant and $n_\times = n$. However, on lowering temperature, the limiting
density of excited states is achieved at
$$T_c(n) = \frac{h^2}{2\pi mk_B}\left(\frac{n}{g\zeta_{3/2}}\right)^{2/3}. \tag{VII.55}$$
For $T \leq T_c$, $z$ gets stuck to unity ($\mu = 0$). The limiting density of excited states,
$n^* = g\zeta_{3/2}/\lambda^3 \propto T^{3/2}$, is then less than the total particle density. The remaining gas
particles, with density $n_0 = n - n^*$, occupy the lowest energy state with $\vec k = 0$. The
phenomenon of a macroscopic occupation of a single one-particle state is known as Bose-
Einstein condensation.
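
The numbers entering eqs.(VII.54) and (VII.55) are easy to reproduce. The sketch below evaluates $\zeta_m$ from its defining series with an Euler-Maclaurin tail correction (the plain series converges slowly for $m = 3/2$), then computes $T_c(n)$ and the condensate fraction $n_0/n = 1 - (T/T_c)^{3/2}$ that follows from $n^* \propto T^{3/2}$. The function names are illustrative, and the example inputs (the mass of a helium-4 atom and a volume of $46.2\,$Å$^3$ per particle) are used only as sample values.

```python
import math

H = 6.62607015e-34      # Planck constant (J s)
K_B = 1.380649e-23      # Boltzmann constant (J/K)

def zeta(s, n_terms=1000):
    """zeta_s = f_s^+(1) = sum_{a>=1} a^(-s), with an Euler-Maclaurin tail."""
    partial = sum(a ** -s for a in range(1, n_terms + 1))
    n = float(n_terms)
    tail = n ** (1.0 - s) / (s - 1.0) - 0.5 * n ** -s + s * n ** (-s - 1.0) / 12.0
    return partial + tail

def critical_temperature(n, mass, g=1):
    """T_c(n) = h^2 / (2 pi m k_B) * (n / (g zeta_{3/2}))^(2/3), eq.(VII.55)."""
    return H ** 2 / (2.0 * math.pi * mass * K_B) * (n / (g * zeta(1.5))) ** (2.0 / 3.0)

def condensate_fraction(t_over_tc):
    """n_0 / n = 1 - (T/T_c)^(3/2) below T_c, and zero above it."""
    return max(0.0, 1.0 - t_over_tc ** 1.5)

if __name__ == "__main__":
    print("zeta_{3/2} =", zeta(1.5), " zeta_{5/2} =", zeta(2.5))
    m_he4 = 6.6464731e-27            # sample mass (kg): one helium-4 atom
    n_sample = 1.0 / 46.2e-30        # sample density (m^-3): 46.2 cubic angstrom per particle
    print("T_c (K) =", critical_temperature(n_sample, m_he4))
    for t in (0.0, 0.5, 1.0, 1.5):
        print(t, condensate_fraction(t))
```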

The bose condensate has some unusual properties. The gas pressure for $T < T_c$,
$$\beta P = \frac{g}{\lambda^3}f_{5/2}^+(1) = \frac{g}{\lambda^3}\zeta_{5/2} \approx 1.341\,\frac{g}{\lambda^3}, \tag{VII.56}$$
vanishes as $T^{5/2}$ and is independent of density. This is because only the excited fraction
$n^*$ has finite momentum and contributes to the pressure. Alternatively, bose condensation
can be achieved at a fixed temperature by increasing density (reducing volume). From
eq.(VII.54), the transition occurs at a specific volume
$$v^* = \frac{1}{n^*} = \frac{\lambda^3}{g\zeta_{3/2}}. \tag{VII.57}$$
For $v < v^*$, the pressure-volume isotherm is flat, since $\partial P/\partial v \propto \partial P/\partial n = 0$ from
eq.(VII.56). The flat portion of isotherms is reminiscent of coexisting liquid and gas
phases. We can similarly regard bose condensation as the coexistence of a “normal gas” of
specific volume $v^*$, and a “liquid” of volume 0. The vanishing of the “liquid” volume is an
unrealistic feature due to the absence of any interaction potential between the particles.
Bose condensation combines features of discontinuous (first order), and continuous 
(second order) transitions; there is a finite latent heat while the compressibility diverges. 
The latent heat of the transition can be obtained from the Clausius—Clapeyron equation 


which gives the change of the transition temperature with pressure as
$$\left.\frac{dT}{dP}\right|_{\rm Coexistence} = \frac{\Delta V}{\Delta S} = \frac{T_c\left(v^* - v_0\right)}{L}. \tag{VII.58}$$




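Since eq.(VII.56) gives the gas pressure right up to the transition point,
$$\left.\frac{dP}{dT}\right|_{\rm Coexistence} = \frac{5}{2}\,\frac{P}{T}. \tag{VII.59}$$
Using the above equations we find a latent heat,
$$\begin{aligned}
L &= T_cv^*\left.\frac{dP}{dT}\right|_{\rm Coexistence} = \frac{5}{2}Pv^* = \frac{5}{2}k_BT_c\,\frac{g}{\lambda^3}\zeta_{5/2}\left(\frac{g}{\lambda^3}\zeta_{3/2}\right)^{-1}, \quad\Longrightarrow \\
L &= \frac{5}{2}\,\frac{\zeta_{5/2}}{\zeta_{3/2}}\,k_BT_c \approx 1.28\,k_BT_c.
\end{aligned} \tag{VII.60}$$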

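To find the compressibility $\kappa_T = \partial n/\partial P|_T/n$, take derivatives of eqs.(VII.35), and
take advantage of the identity in eq.(VII.52) to get
$$\begin{aligned}
\frac{dP}{dz} &= \frac{gk_BT}{\lambda^3}\,\frac{1}{z}\,f_{3/2}^+(z), \\
\frac{dn}{dz} &= \frac{g}{\lambda^3}\,\frac{1}{z}\,f_{1/2}^+(z).
\end{aligned} \tag{VII.61}$$
The ratio of these equations leads to
$$\kappa_T = \frac{f_{1/2}^+(z)}{nk_BT\,f_{3/2}^+(z)}, \tag{VII.62}$$
which diverges at the transition since $\lim_{z\to1}f_{1/2}^+(z) \to \infty$, i.e. the isotherms approach
the flat coexistence portion tangentially.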


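From the expression for energy in the grand canonical ensemble,
$$E = \frac{3}{2}PV = \frac{3}{2}V\frac{g}{\lambda^3}k_BT\,f_{5/2}^+(z) \propto T^{5/2}f_{5/2}^+(z), \tag{VII.63}$$
and using eq.(VII.52), the heat capacity is obtained as
$$C_{V,N} = \left.\frac{dE}{dT}\right|_{V,N} = \frac{3}{2}V\frac{g}{\lambda^3}k_BT\left[\frac{5}{2T}f_{5/2}^+(z) + \frac{f_{3/2}^+(z)}{z}\left.\frac{dz}{dT}\right|_{V,N}\right]. \tag{VII.64}$$
The derivative $dz/dT|_{V,N}$ is found from the condition of fixed particle number, using
$$\left.\frac{dN}{dT}\right|_V = 0 = \frac{g}{\lambda^3}V\left[\frac{3}{2T}f_{3/2}^+(z) + \frac{f_{1/2}^+(z)}{z}\left.\frac{dz}{dT}\right|_{V,N}\right]. \tag{VII.65}$$
Substituting the solution
$$\frac{T}{z}\left.\frac{dz}{dT}\right|_{V,N} = -\frac{3}{2}\,\frac{f_{3/2}^+(z)}{f_{1/2}^+(z)}$$
into eq.(VII.64) yields
$$\frac{C_V}{Vk_B} = \frac{3}{2}\,\frac{g}{\lambda^3}\left[\frac{5}{2}f_{5/2}^+(z) - \frac{3}{2}\,\frac{f_{3/2}^+(z)^2}{f_{1/2}^+(z)}\right]. \tag{VII.66}$$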

Expanding the result in powers of $z$ indicates that at high temperatures the heat capacity
is larger than the classical value; $C_V/Nk_B = 3/2\left[1 + n\lambda^3/2^{7/2} + \cdots\right]$. At low temperatures,
$z = 1$ and
$$\frac{C_V}{Nk_B} = \frac{15}{4}\,\frac{g}{n\lambda^3}\,\zeta_{5/2} = \frac{15}{4}\,\frac{\zeta_{5/2}}{\zeta_{3/2}}\left(\frac{T}{T_c}\right)^{3/2}. \tag{VII.67}$$
The origin of the $T^{3/2}$ behavior at low temperatures is easily understood. At $T = 0$ all
particles occupy the $\vec k = 0$ state. At small but finite temperatures there is occupation of
states of finite momentum, up to a value of approximately $k_m$, such that $\hbar^2k_m^2/2m = k_BT$.
Each of these states has an energy proportional to $k_BT$. The excitation energy in $d$
dimensions is thus given by $E_\times \propto Vk_m^dk_BT$. The resulting heat capacity is $C_V \propto Vk_BT^{d/2}$.
The reasoning is similar to that used to calculate the heat capacities of a phonon (or
photon) gas. The difference in the power laws simply originates from the difference in
the energy spectrum of low energy excitations ($\mathcal{E}(k) \propto k^2$ in the former and $\mathcal{E}(k) \propto k$ for
the latter). In both cases the total number of excitations is not conserved, corresponding
to $\mu = 0$. For the bose gas, this lack of conservation only persists up to the transition
temperature, at which point all particles are excited out of the reservoir with $\mu = 0$ at
$\vec k = 0$. $C_V$ is continuous at $T_c$, reaching a maximum value of approximately $1.92k_B$ per
particle, but has a discontinuous derivative at this point.
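
The two numerical coefficients quoted in this section, the latent heat of eq.(VII.60) and the peak heat capacity from eq.(VII.67) at $T = T_c$, both follow from the ratio $\zeta_{5/2}/\zeta_{3/2}$. A minimal sketch that reproduces them is given below; the Euler-Maclaurin evaluation of $\zeta_m$ and the function names are implementation choices made here for illustration.

```python
def zeta(s, n_terms=1000):
    """zeta_s = sum_{a>=1} a^(-s), with an Euler-Maclaurin tail correction."""
    partial = sum(a ** -s for a in range(1, n_terms + 1))
    n = float(n_terms)
    tail = n ** (1.0 - s) / (s - 1.0) - 0.5 * n ** -s + s * n ** (-s - 1.0) / 12.0
    return partial + tail

def latent_heat_over_kTc():
    """L / (k_B T_c) = (5/2) zeta_{5/2} / zeta_{3/2}, eq.(VII.60)."""
    return 2.5 * zeta(2.5) / zeta(1.5)

def heat_capacity_below_tc(t_over_tc):
    """C_V / (N k_B) = (15/4) (zeta_{5/2}/zeta_{3/2}) (T/T_c)^(3/2), eq.(VII.67)."""
    return 3.75 * zeta(2.5) / zeta(1.5) * t_over_tc ** 1.5

if __name__ == "__main__":
    print("L / (k_B T_c) =", latent_heat_over_kTc())
    print("C_V / (N k_B) at T_c =", heat_capacity_below_tc(1.0))
```

The peak value exceeds the classical $3/2$, consistent with the statement that the high-temperature heat capacity approaches the classical limit from above.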




VII.G Superfluid He⁴


Interesting examples of quantum fluids are provided by the isotopes of helium. The
two electrons of He occupy the 1s orbital with opposite spins. The filled orbital makes
this noble gas particularly inert. There is still a van der Waals attraction between two He
atoms, but the interatomic potential has a shallow minimum of depth 9°K at a separation
of roughly 3Å. The weakness of this interaction makes He a powerful wetting agent. As the
He atoms have a stronger attraction for practically all other molecules, they easily spread
over the surface of any substance. Due to its light mass, the He atom undergoes large zero
point fluctuations at T = 0. These fluctuations are sufficient to melt a solid phase, and
thus He remains in a liquid state at ordinary pressures. Pressures of over 25 atmospheres
are required to sufficiently localize the He atoms to result in a solid phase at T = 0. A
remarkable property of the quantum liquid at zero temperature is that, unlike its classical
counterpart, it has zero entropy.

The lighter isotope He³ has three nucleons and obeys fermi statistics. The liquid
phase is very well described by a fermi gas of the type discussed in sec. VII.E. The
interactions between atoms do not significantly change the non-interacting picture described
earlier. By contrast, the heavier isotope He⁴ is a boson. Helium can be cooled down by
a process of evaporation: Liquid helium is placed in an isolated container, in equilibrium
with the gas phase at a finite vapor density. As the helium gas is pumped out, part of the
liquid evaporates to take its place. The evaporation process is accompanied by the release
of latent heat which cools the liquid. The boiling liquid is quite active and turbulent, just
as a boiling pot of water. However, when the liquid is cooled down to below 2.2°K, it
suddenly becomes quiescent and the turbulence disappears. The liquids on the two sides
of this phase transition are usually referred to as HeI and HeII.

HeII has unusual hydrodynamic properties. It flows through the finest capillaries
without any resistance. Consider an experiment that pushes HeII from one container to
another through a small tube packed with powder. For ordinary fluids a finite pressure
difference between the containers, proportional to viscosity, is necessary to maintain the
flow. HeII flows even in the limit of zero pressure difference and acts as if it has zero
viscosity. For this reason it is referred to as a superfluid. The superflow is accompanied
by heating of the container that loses HeII and cooling of the container that accepts
it; the mechano-caloric effect. Conversely, HeII acts to remove temperature differences
by flowing away from hot regions. This is the basis of the fountain effect in which the
superfluid spontaneously moves up a tube from a heated container.




In some other circumstances HeII behaves as a viscous fluid. A classical method
for measuring the viscosity of a liquid is via torsional oscillators: A collection of closely
spaced disks connected to a shaft is immersed in the fluid and made to oscillate. The
period of oscillations is proportional to the moment of inertia, which is modified by the
quantity of fluid that is dragged by the oscillator. This experiment was performed on HeII
by Andronikashvili who indeed found a finite viscous drag. Furthermore, the changes in
the frequency of the oscillator with temperature indicate that the quantity of fluid that
is dragged by the oscillator starts to decrease below the transition temperature. The
measured normal density vanishes as $T \to 0$, approximately as $T^4$.

In 1938, Fritz London suggested that a good starting hypothesis is that the transition 
to the superfluid state is related to the Bose-Einstein condensation. This hypothesis can 
account for a number of observations: 

(1) The critical temperature of an ideal bose gas of volume $v = 46.2\,\text{Å}^3$ per particle is 
obtained from eq.(VII.55) as 

$$ T_c = \frac{h^2}{2\pi m_{\rm He} k_B} \left( v\, \zeta_{3/2} \right)^{-2/3} \approx 3.14^\circ{\rm K}. \qquad {\rm (VII.68)} $$

The actual transition temperature of $T_c \approx 2.18^\circ$K is not far from this value. 

(2) The origin of the transition has to be related to quantum statistics as He$^3$, which is 
atomically similar, but a fermion, does not have a similar transition. (Actually He$^3$ does 
become superfluid but at temperatures of only a few m°K. This follows a pairing of He$^3$ 
atoms which changes their statistics.) 

(3) A bose condensate can account for the observed thermo-mechanical properties of 
He II. The expression for pressure in eq.(VII.56) is only a function of temperature, and not 
density. As such, variations in pressure are accompanied by changes in temperature. This 
is also the reason for the absence of boiling activity in He II. Bubbles nucleate and grow 
in a boiling liquid at local hot spots. In an ordinary fluid, variations in temperature relax 
to equilibrium only through the slow process of heat diffusion. By contrast if the local 
pressure is a function of temperature only, there will be increased pressure at a hot spot. 
The fluid flows in response to pressure gradients and removes them very rapidly (at the 
speed of sound in the medium). 

(4) Hydrodynamic behavior of He II can be explained by Tisza’s two-fluid model, which 
postulates the coexistence of two components for $T < T_c$: 

(a) A normal component of density $\rho_n$, moving with velocity $\vec{v}_n$, and having a finite 
entropy density $s_n$. 


(b) A superfluid component of density $\rho_s$, which flows without viscosity and with no 
vorticity ($\nabla \times \vec{v}_s = 0$), and has zero entropy density, $s_s = 0$. 
In the super-leak experiments, it is the superfluid component that passes through, reducing 
the entropy and hence temperature. In the Andronikashvili experiment, the normal 
component sticks to the torsional oscillator and gets dragged by it. This experiment thus 
gives the ratio of $\rho_n$ to $\rho_s$. 
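
The ideal-gas estimate in item (1) above can be checked numerically. The sketch below evaluates $T_c = \frac{h^2}{2\pi m k_B}(v\,\zeta_{3/2})^{-2/3}$ for the quoted volume per particle. The physical constants, the value $\zeta_{3/2} \approx 2.612$, and the function name are inputs supplied here and are not taken from the text.

```python
import math

# Physical constants in SI units (supplied here, not quoted in the text).
H = 6.62607015e-34                  # Planck constant, J s
K_B = 1.380649e-23                  # Boltzmann constant, J/K
M_HE = 4.002602 * 1.66053907e-27    # mass of a helium-4 atom, kg
ZETA_3_2 = 2.612                    # Riemann zeta function at 3/2 (rounded)


def bec_temperature(volume_per_particle, mass=M_HE):
    """Condensation temperature (K) of an ideal spinless bose gas.

    volume_per_particle -- volume per particle v, in cubic meters
    """
    prefactor = H**2 / (2.0 * math.pi * mass * K_B)
    return prefactor * (volume_per_particle * ZETA_3_2) ** (-2.0 / 3.0)


t_c = bec_temperature(46.2e-30)     # 46.2 cubic angstroms per particle
print(f"T_c = {t_c:.2f} K")
```

With these inputs the estimate comes out at roughly 3 K, somewhat above the measured transition temperature of about 2.18°K.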
There are, however, many important differences between superfluid helium and the 
ideal bose condensate: 
(1) Interactions certainly play an important role in the liquid state. The Bose-Einstein 
condensate has infinite compressibility, while He II has a finite density, related to atomic 
volume, and is essentially incompressible. 
(2) It can be shown that, even at T = 0, the ideal bose condensate is not superfluid. This 
is because the low energy spectrum, $\epsilon(k) = \hbar^2 k^2/2m$, admits too many excitations. Any 
external body moving through such a fluid can easily lose energy to it by exciting these 
modes, leading to a finite viscosity. 
(3) The detailed functional forms of the heat capacity and superfluid density are very 
different from their counterparts in the ideal bose condensate. The measured heat capacity 
diverges at the transition with a characteristic shape similar to a $\lambda$, and vanishes at low 
temperatures as $T^3$ (compared to $T^{3/2}$ for the ideal bose gas). The superfluid density 
obtained in the Andronikashvili experiment vanishes as $(T_c - T)^{2/3}$ at the transition, while 
the normal component vanishes as approximately $T^4$ as $T \to 0$ (compared to $T^{3/2}$ for the 
condensate density in eq.(VII.53)). Understanding the nature of the singular behavior 
close to $T_c$ is beyond the scope of this discussion. The behavior close to zero temperature, 
however, suggests a different spectrum of low energy excitations. 
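
The statement in item (2) above can be motivated by a standard kinematic argument (often called the Landau criterion), which the text does not spell out. The mass $M$ and velocity $\vec{V}$ of the moving body are symbols introduced here for illustration. The body can shed energy by creating an excitation of momentum $\hbar\vec{k}$ only if energy and momentum are both conserved:

```latex
\begin{align}
\frac{M V^2}{2} &= \frac{|M\vec{V} - \hbar\vec{k}|^2}{2M} + \epsilon(k)
\quad\Longrightarrow\quad
\epsilon(k) = \hbar\,\vec{k}\cdot\vec{V} - \frac{\hbar^2 k^2}{2M} \le \hbar k V , \\
V &\ge \frac{\epsilon(k)}{\hbar k} = \frac{\hbar k}{2m} \;\xrightarrow{\;k \to 0\;}\; 0
\qquad \text{for } \epsilon(k) = \frac{\hbar^2 k^2}{2m} .
\end{align}
```

Under these assumptions the threshold velocity is zero for the free-particle spectrum, so a body moving at any speed can excite modes and lose energy.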

Based on the shape of the experimentally measured heat capacity, Landau suggested 
that the spectrum of low energy excitations is actually similar to that of phonons. This 
is a consequence of the interactions between the particles. The low energy excitations of 
a classical liquid are longitudinal sound waves. (In comparison, a solid admits three such 
excitations, two transverse and one longitudinal.) The correspondence principle suggests 
that quantized versions of such modes should also be present in a quantum liquid. As 
in the case of phonons, a linear spectrum of excitations leads to a heat capacity that 
vanishes as $T^3$. The speed of sound waves can be computed from the coefficient of the 
$T^3$ dependence as $v \approx 240\,{\rm m\,s^{-1}}$. A further anomaly in heat capacity can be explained by 
assuming that the spectrum of excitations bends down and has a minimum in the vicinity 
of a wavenumber $k_0 \approx 2\,\text{Å}^{-1}$. The excitations in the vicinity of this minimum are referred 
to as rotons, and have an energy 

$$ \epsilon_{\rm roton}(k) = \Delta + \frac{\hbar^2}{2\mu}(k - k_0)^2, \qquad {\rm (VII.69)} $$

with $\Delta \approx 8.6^\circ$K, and $\mu \approx 0.16\, m_{\rm He}$. The spectrum postulated by Landau was confirmed 
directly by neutron scattering measurements in the 1950’s. 
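
The two branches of the spectrum described above can be put into numbers with a short sketch. It uses the parameters quoted in the text (sound speed of about 240 m/s, roton gap of about 8.6°K, roton mass of about 0.16 helium masses, and a minimum near 2 inverse angstroms). The physical constants, the function names, and the single-mode phonon heat capacity formula $C/V = \frac{2\pi^2}{15} k_B (k_B T/\hbar v)^3$ are supplied here as assumptions and are not taken from the text.

```python
import math

# Physical constants in SI units (supplied here, not quoted in the text).
HBAR = 1.054571817e-34              # reduced Planck constant, J s
K_B = 1.380649e-23                  # Boltzmann constant, J/K
M_HE = 4.002602 * 1.66053907e-27    # mass of a helium-4 atom, kg

# Parameters quoted in the text.
SOUND_SPEED = 240.0                 # v, m/s
ROTON_GAP = 8.6 * K_B               # Delta, J
ROTON_MASS = 0.16 * M_HE            # mu, kg
K0 = 2.0e10                         # k_0, 1/m (2 inverse angstroms)


def phonon_energy(k):
    """Linear (phonon) branch: epsilon = hbar * v * k, in joules."""
    return HBAR * SOUND_SPEED * k


def roton_energy(k):
    """Quadratic expansion around the roton minimum, in joules."""
    return ROTON_GAP + HBAR**2 * (k - K0) ** 2 / (2.0 * ROTON_MASS)


def phonon_heat_capacity(temperature):
    """Heat capacity per unit volume, J/(K m^3), of one longitudinal mode
    with linear dispersion (low-temperature limit)."""
    x = K_B * temperature / (HBAR * SOUND_SPEED)
    return (2.0 * math.pi**2 / 15.0) * K_B * x**3


# Compare the linear branch, extrapolated to k_0, with the roton minimum.
print(f"phonon branch at k_0: {phonon_energy(K0) / K_B:.1f} K")
print(f"roton minimum:        {roton_energy(K0) / K_B:.1f} K")
print(f"C/V at 0.5 K:         {phonon_heat_capacity(0.5):.3e} J/(K m^3)")
```

Extrapolating the linear branch to $k_0$ gives an energy well above the roton gap, which is one way to see that the spectrum must bend down before reaching the minimum.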




